


DOCUMENT RESUME 



ED 230 618 



TM 830 460 



AUTHOR          Brandt, David A.; And Others
TITLE           Development of the National Assessment of Educational
                Progress.
INSTITUTION     American Institutes for Research in the Behavioral
                Sciences, Palo Alto, Calif.
SPONS AGENCY    National Inst. of Education (ED), Washington, DC.
REPORT NO       AIR-25900-11/82-FR
PUB DATE        Nov 82
CONTRACT        NIE-400-82-0015
NOTE            87p.
PUB TYPE        Reports - Evaluative/Feasibility (142)
EDRS PRICE      MF01/PC04 Plus Postage.
DESCRIPTORS     *Educational Assessment; Educational Testing;
                Elementary Secondary Education; *Federal Programs;
                *Program Descriptions; *Program Development; *Program
                Improvement; Psychometrics
IDENTIFIERS     *National Assessment of Educational Progress

ABSTRACT

This report discusses five issue areas in which it is
believed substantial improvements in the National Assessment of
Educational Progress (NAEP) might be achieved. The unifying themes
among these issues are to increase the visibility of NAEP, its
relevance to policymakers, and its utility to state and local
agencies. The first of five substantive chapters deals with the
critical need for an overall framework for NAEP objectives. The
second chapter deals with the design of test administration and
focuses on the costs and benefits of a unified, integrated assessment
given each year. The next chapter indicates that exercises are the
wrong unit of analysis for NAEP and compares latent trait and latent
class approaches to the development of meaning in assessment,
pointing out the special applicability of the latent class analysis
for achievement indices. The fourth chapter reviews existing studies
of computer-based testing, seeks out predictions of future
technological advances, and proposes a gradual series of studies
aimed at the ultimate infusion of computer-administered tests
throughout NAEP. In the final chapter, a concept and plan are
described for an Educational Assessment Institute. Primary type of
information provided by report: Program Description (Operating
Policies); Procedures (Conceptual). (Author/PN)



***********************************************************************

* Reproductions supplied by EDRS are the best that can be made * 

* from the original document. * 



AIR-25900-11/82-FR 




AMERICAN INSTITUTES FOR RESEARCH 
IN THE BEHAVIORAL SCIENCES 

P.O. Box 1113, 1791 Arastradero Rd., Palo Alto, Ca. 94302 • 415/493-3650



Development of the National Assessment 

of Educational Progress



David A. Brandt
John G. Claudy
Kevin J. Gilmartin
Steven M. Jung
Donald H. McLaughlin
Sandra R. Wilson



U.S. DEPARTMENT OF EDUCATION
NATIONAL INSTITUTE OF EDUCATION

EDUCATIONAL RESOURCES INFORMATION
CENTER (ERIC)

• This document has been reproduced as
received from the person or organization
originating it.

• Minor changes have been made to improve
reproduction quality.

• Points of view or opinions stated in this document
do not necessarily represent official NIE
position or policy.



This work was done under Contract No. 400-82-0015 with the National
Institute of Education, Department of Education. The content does not
necessarily reflect the position or policy of either agency, however, and no
official endorsement should be inferred.



November 1982 






An Equal Opportunity Employer



TABLE OF CONTENTS

Preface

Objectives for a National Assessment of Educational Progress
    The Role and History of NAEP Objectives
    Current Assessment Objectives
    Future NAEP Assessment Objectives

Annual Assessments of Learning Areas
    General Specifications
    Potential Benefits
    Potential Costs
    Conclusion

Measurement Founded on Modern Psychometric Theory
    Background
    Aptitude versus Achievement
    Multiple Matrix Sampling
    Reporting Results from Individual Exercises
    Latent Trait Analysis
    Latent Class Analysis
    Conclusion

Computer-Administered Testing
    Feasibility
    Practicality
    Recommended Feasibility Study
    Conclusion

Establishment of an Educational Assessment Institute
    Introduction
    Potential Functions for an Educational Assessment Institute
    Institute Organization
    Institute Funding
    Present Status of Institute Planning

References

Appendix A: PLAN Master Objectives (Foreword and Introduction)



LIST OF TABLES

Table 1. Current Objectives of the National Assessment
         of Educational Progress

Table 2. Summary of NAEP Assessment Schedule

Table 3. Summary of Released Exercises and Estimates
         of Relative Numbers of Exercises Per Assessment
         by Assessment Area

Table 4. Critical Factors Related to the Quality of Life
         of Adults

Table 5. Educational Goals for Elementary and Secondary
         Education as Adopted by State Governments






Preface 



The National Assessment of Educational Progress was established with
several purposes, one of which was to pioneer new methods for filling
information needs in education. In 1982, the National Institute of Education
took such a pioneering step: the funding of five parallel projects
aimed at producing plans for carrying out NAEP. This step was taken, at
least in part, because of the dearth of proposals received in 1978 in
response to the previous announcement of a NAEP grant competition.

And that step has been effective. The American Institutes for Research
and (we assume) the Educational Testing Service and the National
Opinion Research Center have prepared competing proposals for NAEP while
exploring innovations that can improve NAEP no matter who the grantee may
be. AIR (and we assume ETS and NORC) has reached the conclusion that the
inadequacy with which NAEP has been performed by the incumbent demands a
fresh approach.

In our proposal for the planning grant, AIR discussed six issue areas
in which we believed substantial improvements in NAEP might be achieved.
The unifying themes among these issue areas were increasing the visibility
of NAEP, its relevance to policymakers, and its utility to state and local
education agencies. This report expands five of those areas. We have not
devoted a chapter to the sixth because, upon further consideration, the
improvements appeared straightforward and not in the least problematic.
Our efforts in each of the other five areas represent initial steps, to be
expanded upon by the NAEP grantee. We feel each of the innovations we
describe is essential for the future health and productivity of NAEP.

The first of the five substantive chapters of this report deals with
objectives. Sandra Wilson, Director of AIR's Medical College Admissions
Test effort, carefully examined the NAEP objectives and compared their
characteristics with the objectives of MCAT, which she developed through a
procedure that can be applied to NAEP. Dr. Wilson points out the critical
need for an overall framework for NAEP objectives and outlines a plan for
surmounting the difficulties engendered by bringing up the topic of
"refining objectives." The groundwork that John Flanagan laid, through
Project PLAN and through his study of educational goals, provides a good
starting point for this effort.

The second substantive chapter deals with the design of test administration
and focuses on the costs and benefits of a unified, integrated
assessment given each year. Don McLaughlin, Director of AIR's NAEP Planning
Grant effort, describes one scenario for NAEP and points out that
assessing each area each year (in each booklet, even) will improve the
responsiveness, power, and research utility of NAEP at little or no cost.
These conclusions are based on the experience of AIR in policy research
and on the expertise of our Social Indicators Research Program staff.

In the next chapter, David Brandt, who has recently joined AIR from
the University of Chicago Behavioral Science Department, agrees with
Darrell Bock that exercises are the wrong unit of analysis for NAEP but
goes further to compare latent trait and latent class approaches to the
development of meaning in the assessment. While both have their place,
David points out the special applicability of latent class analysis for
achievement indices. In any case, it is essential, he argues, to modify
the NAEP matrix sampling design to allow estimation of scores across
booklets.

The fourth substantive chapter discusses the dream of computer-administered
testing in down-to-earth terms. Dr. John Claudy, former Director
of the Project TALENT Data Bank and currently a senior staff member in
AIR's Measurement, Analysis, and Utilization Group, reviewed existing
studies of computer-based testing, sought out predictions of future
technological advances, and proposes a gradual series of studies aimed at the
ultimate infusion of computer-administered tests throughout NAEP.

In the final chapter, Dr. Steven Jung, Director of AIR's Institute
for Analysis of Educational Policy, describes AIR's concept and plans for
an Educational Assessment Institute. We believe this to be a special
aspect of AIR's planning for NAEP, and our partner in the development of
the concept, the Stanford University School of Education, has special
qualifications for managing such an Institute.



The chapter we did not develop, on modifications of the sampling
design to make estimates for a variety of target groups and issues as they
arise, was to have been written by Dr. Lauress Wise. He examined the
issues involved and attested to the fact that there were really no important
issues: of course it could be done. We did reach the conclusion that
unless somebody else is busy producing a school district file from the
1980 Census, it might be appropriate for NAEP to tackle this effort for
the school districts contained in sampled PSUs.

Others who worked on this effort include Paul Schwarz, President of
AIR, William Clemans, Vice President and Director of AIR's Palo Alto
office, Robert Krug, AIR's Director of Research, and senior research staff
including Barbara Bessey, Bob Rossi, Darlene Russ-Eft, Terry Armstrong,
Laurie Wise, and Patti Bourexis. John Flanagan provided guidance, especially
for the discussion of objectives. Kevin Gilmartin edited the text
and put together the planning grant report.



Donald H. McLaughlin 
November 1, 1982 






Objectives for a National Assessment of Educational Progress 

The Role and History of NAEP Objectives 

The basic purpose of the National Assessment of Educational Progress
is to collect data and report over time on the performance of young Americans
in reading, mathematics, and communications; to conduct assessments
in other subject areas as the need arises; and to provide state and local
educational agencies with technical assistance in interpreting assessment
results and conducting their own assessments. In attempting to implement
the first two purposes, NAEP selected ten major learning areas as its
original focus, and within each of these areas it developed lists of
specific objectives on which performance was to be assessed. As the
assessment has developed, certain areas have been combined such that there
are now eight assessment areas. The NAEP objectives form the heart of the
assessment, and the set of objectives in each assessment area is the end
result of extended deliberation by subject matter specialists, educators,
and concerned lay persons over more than 17 years, whether as advisors,
consultants to NAEP, contractors, or NAEP staff. These objectives "form the
framework for the learning area assessed" (NAEP, SY-OI-36), encompassing the
knowledge, skills, understandings, and attitudes that are to be assessed
in each area.

Because of their centrality, it is essential first to examine these
objectives as they now stand, after having been hammered out in a long,
arduous, and expensive consensus process, and then to consider what
issues have arisen and might arise with respect to the objectives in the
future and how these might best be addressed.

Before considering the objectives themselves, however, it is important
to realize several general features about them. The first is that
the development of objectives has been carried out with a high degree of
independence from one assessment area to the next. Hence, each set of
objectives has been subject to the prevailing perspectives of experts in
that content area as to the structure and the desirable level of detail
and as to how objectives should be made appropriate for the different age
groups to be assessed. Within a given area, objectives (and exercises)
have been required to meet at least one of the following criteria, namely,
that they

• be considered important by scholars in the discipline in 
question, 

• be acceptable to most educators as desirable teaching goals 
in most schools, and 

• be considered desirable by thoughtful lay citizens.

A point worth noting is that these "criteria" are not so much criteria
as they are a requirement that the ratification of certain constituent
groups be obtained if an objective is to be included in any assessment.
Another point to note is that the NAEP objectives have not gone without
serious criticism. Greenbaum, Garet, and Solomon (1977), for example,
have severely criticized NAEP's approach to setting objectives, noting
that the discipline or subject-matter assessment approach has serious
limitations. Furthermore, many concerns voiced during the original development
process were never directly addressed but simply set aside as the
highly political process surrounding the formulation of objectives went
forward.

Current Assessment Objectives 

NAEP's assessments of content areas have been performed separately,
and similarly the assessment objectives have been developed and published 
separately. This practice makes it difficult to review the NAEP objectives 
as a whole and to interrelate objectives across content areas. Table 1 
contains an outline of current ECS/NAEP assessment objectives based on the 
eight most recent NAEP objectives booklets: 

• Reading and Literature Objectives: 1979-80 Assessment (1980) 

• Writing Objectives: Second Assessment (1972) 

• Mathematics Objectives: 1981-82 Assessment (1981)



Table 1 



Current Objectives of the National Assessment of Educational Progress



READING AND LITERATURE

  A. Values the benefits of reading for the individual

     1. Recognizes that reading can be a source of enjoyment;
        demonstrates a commitment to reading for enjoyment

        • Do students feel that some reading might be personally
          enjoyable?
        • Do students identify reading, among other activities, as a
          source of enjoyment?
        • Do students spend time reading for enjoyment? What do they
          read? How often?

     2. Recognizes that written materials can contribute to personal
        growth; demonstrates a commitment to reading as one means of
        developing self-understanding

        • Do students think they might learn about themselves and
          others through reading?
        • Do students read for their own personal growth?

     3. Recognizes that reading can be a means of acquiring knowledge
        and solving problems; demonstrates a commitment to reading as
        a means of acquiring knowledge and solving problems

        • Do students think that reading might be a valuable source of
          information?
        • Do students read to gain knowledge and solve problems?

  B. Appreciates the cultural role of written discourse as a way of
     transmitting, sustaining, and changing the values of a society

        • Do students recognize that written materials and society
          influence each other?
        • Do students support the written expression of different
          viewpoints?

II. Comprehends written works

  A. Comprehends words and lexical relationships

        • Can students understand the meaning of words when used in
          the context of written material?
        • Can students understand figurative and idiomatic meanings of
          words?
        • Can students understand case relationships such as actor,
          action, and recipient?
        • Can students understand anaphoric relationships between
          words and their referents?

  B. Comprehends propositional relationships

        • Can students understand propositional relationships, such as
          causality, temporality, and instrumentality, that are
          clearly stated in a paragraph?
        • Can students understand propositional relationships, such as
          causality, temporality, and instrumentality, that are
          implied in a paragraph?

  C. Comprehends textual relationships

        • Can students infer the main idea or purpose of a text?
        • Can students understand the character, mood, theme, or
          meaning in a text?
        • Can students understand various explanations for states or
          events?

III. Responds to written works in interpretive and evaluative ways

  A. Extends understanding of written works through interpretation

     1. Demonstrates awareness of emotional impact of written works

        • Do students experience emotion in response to written works
          intended to be funny, sad, provocative, and so on?
        • Can students relate their emotions to the purpose and
          meaning of a written work?

     2. Applies personal experience to written works

        • Do students recognize relationships between their own
          experience and something they read?
        • Can students effectively apply personal experiences to what
          they read in order to deepen their understanding?

     3. Applies knowledge of other works or other fields of study

        • Can students relate what they read to other works?
        • Can students relate knowledge of other fields of study, such
          as history, science, or philosophy, to what they read?

     4. Analyzes written works

        • Can students identify the formal structure of a work and see
          how that structure contributes to the meaning of the work?
        • Can students identify literary devices and see how these
          devices contribute to the meaning of the work?

  B. Evaluates written works

        • What criteria do students use to evaluate poems and stories?
        • Can students apply appropriate criteria to evaluate a broad
          range of written works?

IV. Applies study skills in reading

  A. Obtains information from nonprose reading facilitators

        • Do students use visual aids when reading?
        • Can students correctly interpret information given on a
          chart, map, or graph?

  B. Uses the various parts of a book

        • Do students use different parts of a book to find
          information?
        • Can students use the different parts of a book to find
          specific information?

  C. Obtains information from materials commonly found in libraries
     or resource centers

        • Do students use various reference materials?
        • Can students find specific information from reference
          materials?

  D. Uses various study techniques

        • Do students use various techniques to aid their studying?
        • Can students adjust their reading rates depending on their
          purpose for reading?



WRITING

I. Demonstrates ability to reveal personal feelings and ideas

  A. Through free expression
  B. Through the use of conventional modes of discourse

II. Demonstrates ability to write in response to a wide range of
    societal demands and obligations. Ability is defined to include
    correctness in usage, punctuation, spelling, and form or
    convention as appropriate to particular writing tasks, e.g.,
    manuscripts, letters.

  A. Social
     1. Personal
     2. Organizational
     3. Community
  B. Business/vocational
  C. Scholastic

III. Indicates the importance attached to writing skills

  A. Recognizes the necessity of writing for a variety of needs (as
     in I and II)
  B. Writes to fulfill those needs
  C. Gets satisfaction, even enjoyment, from having written
     something well



MATHEMATICS

COGNITIVE DOMAIN

I. Content

  A. Numbers and numeration
     1. Numeration (whole numbers, fractions, decimals, percent,
        integers, scientific notation)
     2. Number concepts (whole numbers, fractions, decimals, percent,
        integers)
     3. Operations (whole numbers, fractions, decimals, percents,
        integers)
     4. Mental computation
     5. Estimation
     6. Properties
     7. Relations

  B. Variables and relationships
     1. Facts, definitions, and symbols
     2. Use of variables in equations and inequalities (solutions,
        equivalences, and translations)
     3. Operations with variables
     4. Use of variables to represent elements of a number system
     5. Functions and formulas
     6. Coordinate systems
     7. Exponential and trigonometric functions
     8. Logic

  C. Shape, size, and position
     1. Recognition of figures
     2. Constructions and drawings
     3. Visualization (static and dynamic)
     4. Recognition of relationships (congruence, similarity, and
        symmetry)
     5. Identification of properties from given visual information
        within, between, or among figures
     6. Relationships involving classes of figures
     7. Definitions, postulates, and theorems (recall, inference, and
        application)

  D. Measurement
     1. Unit (appropriate size and type of unit, unit equivalents,
        conversions within a system)
     2. Instrument reading (English and metric rulers, scales,
        thermometers, clocks, etc.)
     3. Linear measure (including nonstandard units)
     4. Area, perimeter, and volume
     5. Precision
     6. Estimation of measurements

  E. Probability and statistics
     1. Organizing, displaying, and interpreting information
        (tallies, graphs, charts, and tables)
     2. Measures of central tendency (mean, median, mode)
     3. Measures of spread and position (range, percentile, standard
        deviations)
     4. Sampling and polling
     5. Probability (simple, compound, and independent events; odds)
     6. Combinations and permutations

  F. Technology
     1. Hand calculator
     2. Computer literacy

II. Process

  A. Mathematical knowledge

        • How well can students recall and recognize facts,
          definitions, and symbols?

  B. Mathematical skill

        • How well can students perform paper-and-pencil computation,
          including computations with whole numbers, integers,
          fractions, decimals, percents, and ratios and proportions?
        • How well can students perform algebraic manipulations?
        • How well can students perform geometric manipulations like
          constructions and spatial visualizations?
        • How well can students make measurements?
        • How well can students read graphs and tables?
        • How well can students compute statistics, probabilities, or
          combinations?
        • How well can students perform mental computations, including
          computation with whole numbers, fractions, decimals, and
          percents?
        • How well can students estimate the answers to computations
          and measurements?
        • How well can students perform computations involving whole
          numbers, decimals, fractions, and percents using
          calculators?
        • How well can students read flow charts or basic computer
          programs?

  C. Mathematical understanding

        • How well can students translate a verbal statement into
          symbols or a figure, and vice versa?
        • How well do students understand mathematical concepts and
          principles?
        • How well can students select the appropriate uses of
          computers?
        • How well can students select an appropriate computational
          method such as paper and pencil, mental, estimation, or
          calculator?

  D. Mathematical application

        • How well can students solve routine textbook problems?
        • How well can students solve nonroutine problems?
        • How well can students apply problem-solving strategies?
        • How well can students interpret data and draw conclusions?
        • How well can students use mathematics, including logic, in
          reasoning and making judgments?
        • How well can students use a calculator to solve application
          problems?

AFFECTIVE DOMAIN

  A. Attitudes

        • How do students feel about the mathematics they encounter in
          school?
        • How do students feel about the various activities in
          mathematics classes?
        • How do students feel about their personal experience with
          mathematics?
        • What are students' beliefs about the nature of mathematics as
          a discipline?
        • What are students' beliefs about the value of mathematics to
          society?
        • What are students' beliefs about computers?



CITIZENSHIP AND SOCIAL STUDIES

I. Demonstrates skills necessary to acquire information

  A. Uses the senses
  B. Uses sources such as card catalogues and indexes, case studies,
     computers, drawings, films, globes and other models, graphs,
     maps, newspapers, photographs, pictures, radio, recordings,
     reference books, slides, tapes, television
  C. Uses techniques such as personal interviews, written essays,
     polls, and questionnaires

II. Demonstrates skills necessary to use information

  A. Organizes information
  B. Applies information
  C. Makes decisions and solves problems
  D. Critically evaluates information

III. Demonstrates an understanding of individual development and the
     skills necessary to communicate with others

  A. Examines individual beliefs, values, and behaviors
  B. Demonstrates individual development
  C. Communicates in graphic and oral forms
  D. Gives attention and responds to the expression of others
  E. Interacts in groups in various capacities
  F. Has effective relations with people having different cultural
     perspectives

IV. Demonstrates an understanding of and interest in the ways human
    beings organize, adapt to, and change their environments

  A. Understands the forces that shape individual human beings
  B. Understands the interrelatedness of human societies
  C. Understands the organization of human societies
  D. Understands the relationships between individuals and groups
  E. Understands the relationships among groups
  F. Understands the relationships between people and the natural
     environment
  G. Has an awareness of global concerns
  H. Has a commitment to human rights worldwide

V. Demonstrates an understanding of and interest in the development
   of the United States

  A. Understands the principles and purposes of the United States
  B. Understands the organization and operation of the governments
     in the United States
  C. Understands political decision making in the United States
  D. Understands the electoral processes in the United States
  E. Understands the basis and organization of the legal system in
     the United States
  F. Knows rights of individuals in the United States
  G. Recognizes civil and criminal judicial systems in the United
     States
  H. Has a commitment to support justice and rights of all
     individuals
  I. Understands economics in the United States
  J. Understands major social changes that have occurred in American
     society
  K. Has a commitment to participating in community service and
     civic improvement



SCIENCE

COGNITIVE DOMAIN

I. Content

  A. Biology
     1. Germ theory and disease
     2. Systematics
     3. Cell theory
     4. Energy transformation
     5. Heredity
     6. Systems
     7. Evolution
     8. Ecology
     9. Behavior
     10. Growth and development

  B. Physical science
     1. Matter
     2. Combinations
     3. Mechanics
     4. Waves
     5. Electricity and magnetism

  C. Earth science
     1. Meteorology
     2. Geology
     3. Oceanography
     4. Astronomy

  D. Integrated topics (multidisciplinary)
     1. Models
     2. Equilibrium
     3. Change
     4. Evolution
     5. Growth
     6. Time/space
     7. Systems
     8. Cycles
     9. Probability

II. Processes

  A. Process/methods
     1. Models
     2. Assumptions
     3. Communications
     4. Measurement
     5. Classification
     6. Observation
     7. Experimentation
     8. Interpretation of data

  B. Science and societal problems
     1. Health and safety
     2. Environment
     3. Growth and resource management

  C. Science and self

  D. Science and technology (applied science)
     1. Biological
     2. Physical

  E. Decision making

AFFECTIVE DOMAIN

  A. Attitudes toward science classes

        • To what extent are science classes enjoyable?
        • To what extent does the student perceive individualization
          in science classes?
        • To what extent do science teachers enjoy science and reflect
          that enjoyment to the students?
        • Are science classes useful?
        • What extracurricular science-related activities do the
          students pursue?

  B. Vocational and educational intentions

        • To what extent do students consider science as an area of
          further study and career possibilities?
        • How do students rate a science-related vocation?

  C. Personal involvement

        • Do students recognize serious problems in the world today?
        • Can students effectively do anything to solve major
          problems?
        • Are students willing to help solve major problems?
        • How often do students participate in activities that aid in
          solving major problems?

  D. Tools - attributes

        • Are the concepts and principles learned in science classes
          useful or applicable in everyday scientific investigation
          and decision making? In problem solving?

  E. Confidence in science

        • Determine the attitudes students have toward the conduct and
          support (i.e., financial) for applied research, basic
          research

  F. Controversial issues

        • What are students' opinions and attitudes about allowing
          research in areas with potential hazards and risks?

  G. Awareness

        • Are students aware of the scientific process and the
          empirical nature of science?
        • Are students aware of the tentativeness of scientific
          theories?

  H. Experience

     1. Experience - done something

        • Have students ever done science-related activities?
        • Would students like to do science-related activities?

     2. Experience - seen something

        • Have students ever seen different events or activities
          related to science?
        • Would students like to see different events or activities
          related to science?

     3. Experience - used something

        • Have students ever used various science-related objects?
        • Would students like to use various science-related objects?

     4. Experience - visited a place

        • Have students ever visited various places related to
          science?
        • Would students like to visit various places related to
          science?

     5. Experience - done experiments

        • Have students ever done experiments or activities with
          various science-related things?
        • Would students like to do experiments or activities with
          various science-related things?



MUSIC

I. Value music as an important realm of human experience

  A. Be affectively responsive to music
  B. Be acquainted with music from different nations, cultures,
     periods, genres, and ethnic groups
  C. Value music in the life of the individual, family, and
     community
  D. Make and support aesthetic judgments about music

II. Perform music

  A. Sing (without score)
  B. Play (without score)
  C. Sing or play from a written score
  D. Sing or play a previously prepared piece

III. Create music

  A. Improvise
  B. Represent music symbolically
     1. Arrange
     2. Compose

IV. Identify the elements and expressive controls of music

  A. Identify the elements of music
     1. Rhythmic organization
     2. Pitch organization
     3. Tone quality
  B. Identify the relationships of elements in a given composition
  C. Demonstrate an understanding of a variety of musical terms,
     expression markings, and conducting gestures in a musical
     context

V. Identify and classify music historically and culturally

  A. Identify and describe the features that characterize a variety
     of folk, ethnic, popular, and art music
  B. Identify and describe the music and musical style of the
     various stylistic periods in Western civilization (e.g.,
     medieval, renaissance, baroque, classical, romantic). Identify
     representative composers of each period
  C. Cite examples of ways in which man utilizes music in his social
     and cultural life



ART 



I. Perceive and respond to aspects of art

     A. Recognize and describe the subject matter
        elements of works of art

     B. Go beyond the recognition of subject matter to
        the perception and description of formal
        qualities and expressive content (the combined
        effect of the subject matter and the specific
        visual form that characterizes a particular work
        of art)

II. Value art as an important realm of human experience

     A. Be affectively oriented toward art

     B. Participate in activities related to art

     C. Express reasonably sophisticated conceptions
        about and positive attitudes toward art and
        artists

III. Produce works of art

     A. Produce original and imaginative works of art

     B. Express visual ideas fluently

     C. Produce works of art with a particular
        composition, subject matter, expressive
        character, or expressive content

     D. Produce works of art that contain various visual
        conceptions

     E. Demonstrate knowledge and application of media,
        tools, techniques, and forming processes

IV. Know about art

     A. Recognize major figures and works in the history
        of art and understand their significance.
        (Significance as it is used here refers to such
        things as works of art that began new styles,
        markedly influenced subsequent works, changed the
        direction of art, contained visual and technical
        discoveries, expressed particularly well the
        spirit of their age, or are considered to be the
        major works of major artists.)

     B. Recognize styles of art, understand the concept
        of style, and analyze works of art on the basis
        of style

     C. Know the history of man's art activity and
        understand the relation of one style or period to
        other styles and periods

     D. Distinguish between factors of a work of art that
        relate principally to the personal style of the
        artist and factors that relate to the stylistic
        period or the entire age

     E. Know and recognize the relationships that exist
        between art and the other disciplines of the
        humanities (literature, music, and particularly
        the history of ideas and philosophy) during a
        given period

V. Make and justify judgments about the aesthetic merit
   and quality of works of art

     A. Make and justify judgments about aesthetic merit

     B. Make and justify judgments about aesthetic quality

     C. Apply specific criteria in judging works of art

     D. Know and understand criteria for making aesthetic
        judgments



CAREER AND OCCUPATIONAL DEVELOPMENT 

I. Knowledge, abilities, and attitudes relevant to
   career decisions

     A. Awareness and knowledge of individual
        characteristics

          1. Abilities

          2. Interests

          3. Values

     B. Knowledge of career and occupational
        characteristics

          1. Major duties

          2. Entry requirements

          3. Work conditions

          4. Benefits and employment practices

          5. Social and technological change

          6. Occupational classification

     C. Making and implementing career and occupational
        decisions

          1. Individual characteristics and occupational
             requirements

          2. Career decision making

          3. Career preparation

          4. Career modification or change

II. Knowledge, abilities, and attitudes necessary for
    success in a career or occupation

     A. Skills generally useful in careers

          1. Numerical skills

          2. Communication skills

          3. Manual/perceptual skills

          4. Information-processing, problem-solving, and
             decision-making skills

          5. Interpersonal skills

          6. Employment-seeking skills

          7. Career-improvement skills

     B. Personal characteristics related to career success

          1. Responsibility and initiative

          2. Adaptability to variable conditions

          3. Attitudes and values

          4. Personal fulfillment

          5. Sources of additional knowledge



• Citizenship/Social Studies Objectives: 1981-82 Assessment
  (1980)

• Science Objectives: Third Assessment (1979)

• Music Objectives: Second Assessment (1980)

• Art Objectives (1971)

• Career and Occupational Development Objectives: Second
  Assessment (1977)

The outlines in Table 1 encompass the NAEP objectives, and they
represent (in fact, in most cases, they are) the questions that the
assessments attempt to answer about performance in each learning area.
In several areas, the booklets also contain further detail on objectives
or questions within the finest level of the outline shown. Thus in
science, in career and occupational development (COD), and in social
studies-citizenship, there is a great amount of additional detail. In
science, this takes the form of "sample" objectives that are
specifically said not to be definitive as far as the potential
assessment exercises are concerned. COD objectives are stated in a
quasi-behavioral form and in extensive detail, for example "Understand
one's own abilities relative to those of others," and "Know where to
find information regarding job openings." Each objective is further
described as to its appropriateness and the sophistication presumed of
persons at each age level. In writing, there is also additional detail
by way of examples of the kinds of writing that might be done by persons
at each of the age levels, but there is very little specificity as to
how ability to write is really to be evaluated or scored at each age
level.

The NAEP objectives are obviously not specific enough to uniquely
define particular exercises as the appropriate measure of performance on
the objectives as a whole. As we will elaborate later, they lack
specification of the stimulus condition to be presented to the student
or the criterion by which mastery is to be evaluated. NAEP exercises can
be referenced to an objective they are supposed to assess, but many
exercises with quite different skill requirements (and hence different
performance outcomes) can be and have been developed and referenced to
the same objective. NAEP exercises are not criterion-referenced. They
can be said to be objectives-referenced. However, the converse
inference, namely that the level of performance on a given exercise can
be interpreted as indicative of the level of performance on the
objective as a whole, is generally not warranted, even when the most
specific NAEP objectives at the most detailed level are considered. This
inference, of course, has been frequently made, and we will have more to
say on that subject in the section on exercise development.

When the various sets of assessment objectives are arrayed in
proximity, as in Table 1, it is difficult to avoid being struck by their
very considerable differences. Objectives in the various areas are not
organized by any consistent framework, and they are phrased very
differently (e.g., as questions, as quasi-behavioral statements, or as
topics). The objectives also differ widely in specificity from area to
area and in how they treat cognitive versus affective objectives (i.e.,
whether affective objectives are integrated into the substantive
material, as in music, art, and social studies-citizenship, or kept as
separate objectives, as in science and math). The areas also differ in
whether and how they recognize "process" objectives or objectives at
differing levels of Bloom's taxonomy (e.g., the differences between
recall or recognition of facts or principles in the subject area and the
ability to apply these principles to the solution of problems or the
evaluation of new situations). It is also evident that there is overlap
between the areas. Applied mathematics objectives occur among the COD
objectives, for example, and problem solving and thinking/information
utilization skills are included in several areas. For the most part,
these are linked to the subject matter of the area in question, but in
COD, for example, they are treated as more general skills of use in all
sorts of employment settings.

Finally, there are types of objectives found in each list that differ
in terms of the aspect of the student to which they refer. Since they
are not treated consistently from list to list, their similarities are
harder to discern. Thus, objectives related to the individual's capacity
(what they can do when asked and appropriately motivated) are not
readily discerned from objectives that relate to habitual response
patterns (that is, what the individual commonly does outside the
assessment situation, for example in leisure activities), or from those
that relate to what the person's intentions or dispositions might be
with respect to future choices or in hypothetical situations.

An issue worth addressing at this point in NAEP's history is whether
a conceptual framework can be developed to integrate and organize the
several assessment areas. That this is feasible without doing violence
to the separate disciplines has been demonstrated in other
objectives-based assessments dealing with the same content areas and
with all grades (e.g., the PLAN Master Objectives, Westinghouse Learning
Corporation, 1971). That this would have advantages in terms of
communication with NAEP's audiences about NAEP objectives is fairly
obvious. What is perhaps less obvious is that such a conceptualization
could suggest other ways of assessing objectives or of clustering
objectives and exercises for reporting purposes. Some consistent
organization could also help identify any types of objectives that have
been emphasized or omitted from particular assessment areas and could
help suggest the most efficient and effective ways of measuring
performance on certain objectives.

As noted above, the precise nature of the objectives in each
assessment area has been determined primarily by the subject matter
experts, educators, and lay public, with the selection of these persons,
the timing of their work, and the process by which their input is
secured being determined, at least at the policy level, by the
Assessment Policy Committee (APC). The APC also sets the schedule and
magnitude of the assessments in various areas. In reviewing the current
NAEP objectives, we could only identify two areas, science and math,
where there was an explicit plan for a differential emphasis between
subareas or sets of objectives within a subject area. This differential
emphasis is reported in the form of a matrix that indicates the
percentage of exercises by age and content (e.g., number and numeration
40%, 40%, and 35% at ages 9, 13, and 17, respectively). In areas where
the objectives have been either substantially revised or combined across
two previously distinct assessment areas, it seems likely that the
emphases among objectives have also changed over time and been reflected
in differential numbers of exercises referenced to particular
objectives. It is not clear whether there have been formal, informal, or
no priorities for assessment of particular objectives in most of the
assessment areas.

We have attempted to induce the relative priorities among content
areas by examining the NAEP assessment schedule, the numbers of released
exercises in each area, and an estimate of the total number of exercises
in each area. Table 2 shows the schedule of assessments by year and
assessment area. Reading-literature (including the planned 1984-85
assessment), mathematics, science, and writing (including the upcoming
assessment in 1983-84) have been most frequently assessed, with social
studies-citizenship, art, and music next in frequency. Career and
occupational development has only been assessed once, perhaps due to the
cost of administration of the exercises originally developed in this
area.

Table 3 provides another perspective on priorities. Assuming that
roughly half of the exercises in each assessment have been released and
that the repeat of materials in successive assessments is at roughly the
same rate in all areas, one can estimate the average number of exercises
per area per assessment cycle and from this obtain the percentage of all
NAEP exercises in a given area in a full assessment cycle (i.e., after
assessing each area once). These numbers are undoubtedly imprecise due
to the unknown amount of duplication in exercises across assessments,
but they generally accord with the priorities evident from the
assessment frequency. However, social studies-citizenship appears to
have relatively more exercises per assessment than its priority for
assessment frequency would suggest, and writing, which has been
frequently assessed, has relied upon fewer different exercises. The
latter is presumably due to the fact that the writing exercises require
production and are primarily in open-ended rather than multiple-choice
format. They also are the longest exercises, requiring up to 20 minutes,
whereas the average is about one minute (a range of about 30-90 seconds)
for other items. Thus, if one were to look at priorities in terms of
time devoted to the gathering of data in each area, the results would
probably be more in line with the priorities inferable from the relative
frequency of assessments in each area.
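
The estimate described above can be sketched in a few lines of Python.
This is an illustrative sketch only, not part of the original analysis:
the names RELEASED, ASSUMED_RELEASE_RATE, exercises_per_assessment, and
cycle_shares are hypothetical, the counts are those read from Table 3,
and the 50 percent release rate is the assumption stated in the text.
Like the table, the sketch ignores overlap in exercises between
assessments, which would lower the estimates.

```python
# Illustrative sketch; all names here are hypothetical, not NAEP's own.
# Each entry maps an area to (released exercises, number of assessments),
# using the counts read from Table 3.
RELEASED = {
    "Art": (112, 2),
    "Social Studies/Citizenship": (344, 2),
    "Career and Occupational Development": (61, 1),
    "Math": (494, 3),
    "Music": (155, 2),
    "Reading/Literature": (275, 2),
    "Science": (599, 3),
    "Writing": (90, 3),
}

# Assumption stated in the text: about half of all exercises used
# have been released.
ASSUMED_RELEASE_RATE = 0.5


def exercises_per_assessment(released, assessments,
                             release_rate=ASSUMED_RELEASE_RATE):
    """Estimate exercises used per assessment.

    With a 50 percent release rate this is 2 x released / assessments.
    Overlap between assessments is ignored.
    """
    return released / release_rate / assessments


def cycle_shares(table):
    """Return each area's percentage of exercises in one full cycle."""
    estimates = {area: exercises_per_assessment(n, k)
                 for area, (n, k) in table.items()}
    total = sum(estimates.values())
    return {area: 100.0 * est / total for area, est in estimates.items()}


if __name__ == "__main__":
    shares = cycle_shares(RELEASED)
    for area in sorted(shares, key=shares.get, reverse=True):
        n, k = RELEASED[area]
        print(f"{area:38s}{exercises_per_assessment(n, k):7.0f}"
              f"{shares[area]:7.1f}%")
```

Changing ASSUMED_RELEASE_RATE shifts every per-assessment estimate by
the same factor, so the percentage shares in a full cycle do not depend
on it; they depend only on the released counts and the numbers of
assessments.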






Table 2

Summary of NAEP Assessment Schedule

Columns (assessment areas): Reading/Literature; Writing; Math;
Citizenship/Social Studies; Science; Career and Occupational
Development; Art; Music; Special

Rows (years): 1969-70 through 1982-83, 1983-84*, 1984-85*

[Cell entries are not legible in the source copy.]

*  Planned
+  Basic life skills (17-year-olds only), health and energy (adults
   only), reading and science (adults only)
++ Consumer skills (17-year-olds only)



Table 3

Summary of Released Exercises and Estimates and Relative Numbers
of Exercises per Assessment by Assessment Area*

                                         % of     Average    % of Exer-  % of Exer-
                              Number of  Total    Number of  cises in a  cises in
Area (Number of               Released   Released Exercises  Full        Multiple-
Assessments)                  Exercises  Exer-    per Assess- Assessment Choice
                                         cises    ment**     Cycle       Format

Art (2)                          112       5%       112         6%         93%
Social Studies/
  Citizenship (2)                344      16%       344        19%         ?8%
Career and Occupational
  Development (1)                 61       3%       122         7%         21%
Math (3)***                      494      23%       329        18%         56%
Music (2)                        155       7%       155         9%         7?%
Reading/Literature (2)***        275      13%       275        15%         79%
Science (3)***                   599      28%       399        22%         93%
Writing (3)                       90       4%        60         3%         17%

Total                          2,130      99%     1,796        99%

Special Problems                 306                                       96%

Total, including
  Special Problems             2,436

*   Data taken from Summary Table of National Assessment Released
    Exercises (10/81), enclosure in NAEP publication SY-()1- )6.

**  Assumes that approximately 50% of exercises used have been
    released. Estimate equals 2 x (number released) divided by number
    of assessments; it does not take account of overlap in exercises
    between assessments, which would reduce the resultant estimates.

*** Summary table of released exercises does not indicate any release
    of exercises for Reading 1974-75, Math 1981-82, or Science 1981-82,
    hence the number of additional items used in these assessments was
    not determinable.






Future NAEP Assessment Objectives 



The current NAEP assessment objectives in each area are the primary
embodiment of the many highly specific questions that the assessment has
undertaken to answer. The objectives (and exercises) may be aggregated
in various patterns to answer a variety of additional questions, but it
is clear that there must be considerable constancy in the basic
objectives if progress is to be measured meaningfully over time. On the
other hand, periodic review and revision of the objectives is needed to
ensure that they remain relevant, and the discussion above has pointed
out several aspects of the objectives that could be substantially
improved, namely their consistency across areas, their specificity, and
the matter of explicit priorities for inclusion of items in each area
when it is assessed.

The Assessment Policy Committee is responsible for setting priorities
for NAEP assessment, which includes responsibility for approving the
procedures used to review and revise objectives, ratification of the
resultant objectives for use in a particular assessment, and
establishment of the assessment schedule. Normally review of the
objectives takes place prior to their use in an assessment in that area.
However, we recommend that in the future the APC provide guidelines to
the subject matter consultants and to the Area Advisory Committees so as
to better integrate their efforts and arrive at a more consistent
structure and format for the objectives across areas. We also recommend
that specific efforts be undertaken to provide empirical data on
educational objectives to assist the APC in its deliberations,
especially in the matter of setting priorities. Finally, with the
concurrence of the APC, the NAEP contractor should provide a more formal
structure to the process by which the input and consensus of a broad
sampling of educational and lay constituencies is obtained concerning
assessment objectives, accomplishing this without requiring the time and
expense of a process composed primarily of committee deliberations.

Sample Issues. The following is a brief discussion of some of the
important issues that we see in conjunction with NAEP objectives.

1. The need for a conceptual framework to relate the various
   assessment areas and objectives to each other, to the
   broader goals of the assessment, and to the plans for
   analysis and reporting.

NAEP is in need of a clearly articulated conceptual framework that
would describe the general dimensions on which it is important to assess
educational progress and relate these, in progressively more specific
hierarchical fashion, to the assessment of major subject areas and to
the objectives within each area. The framework should also make it clear
how other information gathered by NAEP is to be used to assess
educational progress. That is, it should specify what analyses will be
done to compare performance between various populations or areas. And
finally, the framework should clarify what aspects of education and
educational progress are currently adequately addressed by data from
other sources (e.g., NCES data on educational resources) and for what
aspects there is no consistent, adequate data base. While the
originators of NAEP may have felt that they had such a framework in
mind, it has not been explicit and accessible. Its absence is part of
the reason that NAEP does not communicate well to any audience. As times
change, as other data bases develop, and as the needs for particular
types of information become clearer (e.g., on courses offered and course
enrollment), the absence of an explicit conceptual framework for NAEP
makes it difficult to define or redefine the role of a national
assessment appropriately. We propose that the time of transition is an
appropriate, if not long overdue, time for the APC to consider this
issue.



One general conceptual framework that might be considered as an
option for NAEP would be to begin with two very general purposes of
education: (1) preparing students at each grade level to pursue and
profit from subsequent education at the next level and (2) preparing
students for productive and fulfilling roles in society. NAEP's purpose,
then, might be expressed as that of assessing the knowledge, skills,
understanding, and attitudes of students at various grade levels in
relation to the requirements for attaining these goals. From such a
framework would then naturally follow the criteria against which
objectives might be reviewed at each grade level, namely, the extent to
which students' progress in the next grade level or phase of their
education would be impaired by a failure to master a given objective,
and the extent to which their productivity and fulfillment in social
roles outside of school would be impaired by a failure to master the
objective. Such a framework would also provide a focus for integrating
empirical evidence, not just opinion, into the specification of
objectives. Empirical data exist on the predictive relationships between
prior knowledge, skills, and understanding and subsequent academic
progress. There is evidence, albeit less extensive, on the knowledge and
skill requirements of various adult roles.

This is hardly a new concept. As early as 1950, John Flanagan and
others were proposing the empirical definition of "critical
requirements" as the primary basis for establishing educational goals
and assessing progress (e.g., Flanagan, 1950). This framework would also
suggest a dual focus for reporting of assessment results: on the one
hand in relation to students' preparation for subsequent education and
on the other hand in relation to the performance requirements of various
adult roles, for example work, citizenship, health, as a consumer, and
as a spouse and parent. A number of educators concerned about adult
knowledge have conceptualized the general areas in which knowledge is
applied in ways that may be of use to NAEP (e.g., the Northcutt study
conducted at the University of Texas, 1975). An alternative
conceptualization of the relevant areas of adult life for which specific
knowledge, skills, attitudes, and so on are required (areas for which
schools attempt, in varying degrees, to prepare students and in terms of
which NAEP objectives and exercises might be evaluated and results from
NAEP reported) is the set of five critical areas and 15 component
dimensions or factors related to the quality of life of adults as
developed by Flanagan and Russ-Eft (1975) and Flanagan (1978) and shown
in Table 4.

Obviously, different schools and different "schools of thought" tend
to place more responsibility on preparation related to some areas than
to others. The struggle to arrive at a set of general goals for
education or even a framework for stating goals has been a long and
arduous one, as Flanagan has pointed out (1978). But considerable
progress has been made, notably in the years since NAEP began and in
some measure aided by the NAEP effort. Table 5, taken from Flanagan
(1978), summarizes the broad-gauged goals of elementary and secondary
education as enunciated by state governments within the past decade.
Flanagan has noted their similarity to the critical dimensions affecting
the quality of life of adults that have emerged from his research, as
well as their similarity to the cardinal principles of secondary
education as set forth in 1918 by the Commission on the Reorganization
of Secondary Education: "health, command of fundamental processes,
worthy home membership, vocation, civic education, worthy use of
leisure, and ethical character."

Table 4

Critical Factors Related to the Quality of Life of Adults

PHYSICAL AND MENTAL HEALTH DEVELOPMENT

A. Health and personal safety (98%)*

   Enjoying freedom from sickness, possessing physical and mental
   fitness, avoiding accidents and other health hazards. Problems
   related to alcohol, drugs, death, and aging are also included.
   Effective treatment of health problems is a large component.

B. Personal understanding and planning (8?%)

   Developing and gaining orientation, purpose, and guiding principles
   for one's life. This may involve becoming more mature, gaining
   insight into and acceptance of one's assets and limitations,
   experiencing an awareness of personal growth and development, and
   realizing the ability to influence the course of one's life
   significantly. It also includes making decisions and planning life
   activities and roles. For some people, a major component arises from
   religious or spiritual experiences or activities.

INTERPERSONAL DEVELOPMENT

C. Relations with parents, siblings, or other relatives (76%)

   Having parents, siblings, or other relatives. In these relationships
   one experiences communicating with or doing things with them,
   visiting, enjoying, sharing, understanding, being helped by, and
   helping them. The feeling of belonging and having someone to discuss
   things with is a large component.

D. Relations with friends (75%)

   Having close friends. In these relationships one shares activities,
   interests, and views. Important aspects of these relationships
   involve being accepted, visiting, giving and receiving help, love,
   trust, support, and guidance.

E. Relations with spouse (girlfriend or boyfriend) (92%)

   Being married or having a girlfriend or boyfriend. The relationship
   involves love, companionship, sexual satisfaction, understanding,
   communication, appreciation, devotion, and contentment.

F. Socializing (5?%)

   Entertaining at home or elsewhere, attending parties or other social
   gatherings, meeting new people, interacting with others. It may
   include participation in socializing organizations and clubs.

G. Having and raising children (88%)

   Having children and becoming a parent. This relationship involves
   watching their development, spending time with them, and enjoying
   them. Also included are things like molding, guiding, helping,
   appreciating, and learning from them and with them.

INTELLECTUAL AND CREATIVE DEVELOPMENT

H. Intellectual development (84%)

   Learning, attending school, acquiring desired knowledge and mental
   abilities, graduating, and problem solving. Other aspects involve
   improving understanding, comprehension, or appreciation in an
   intellectual area through activities in or out of school.

I. Creativity and personal expression (?0%)

   Showing ingenuity, originality, imagination in music, art, writing,
   handicrafts, drama, photography, practical or scientific matters, or
   everyday activities. This also includes expressing oneself through a
   collection, a personal project, or an accomplishment or achievement.

J. Passive and observational recreational activities (percentage
   illegible)

   Participating in various kinds of passive recreation, such as
   watching television, listening to music, reading, going to the
   movies, and going to entertainment or sports events. It also involves
   appreciating the art and beauty in many aspects of life.

K. Active and participatory recreational activities (55%)

   Participating in various kinds of active recreation, such as sports,
   hunting, fishing, boating, camping, vacation travel, and sightseeing.
   This may also involve such activities as playing sedentary or active
   games, singing, playing an instrument, dancing, or acting.

CAREER DEVELOPMENT

L. Occupational role (job) (90%)

   Having interesting, challenging, rewarding, worthwhile work in a job
   or home. This includes doing well, using one's abilities, learning
   and producing, obtaining recognition, and accomplishing on the job.

M. Material well-being and financial security (73%)

   Having good food, home, possessions, comforts, and expectations of
   these for the future. Money and financial security are typically
   important factors. For most people filling these needs is primarily
   related to their efforts or those of their spouse.

CIVIC DEVELOPMENT

N. Activities relating to local and national governments (45%)

   Keeping informed through the media; participating by voting and other
   communications; having and appreciating one's political, social, and
   religious freedom. One component of this includes having living
   conditions affected by regulations, laws, procedures, and policies of
   governing agencies and the individuals and groups that influence and
   operate them.

O. Activities related to helping or encouraging other people
   (percentage illegible)

   Helping or encouraging adults or children (other than relatives or
   close friends). This can be done through one's efforts as an
   individual or as a member of some organization, such as a church,
   club, or volunteer group, that works for the benefit of other people.

Source: From Table 2 of Flanagan and Russ-Eft (1975).

* The numbers in parentheses are the percentages of 30-year-olds who
  considered the factor important or very important for their quality
  of life (Flanagan & Russ-Eft, 1975).

The value of having a concrete list of general educational goals and
critical areas of adult life is that it can help clarify the various
perspectives in terms of which the assessment objectives might be
evaluated. In fact, such a concrete description can help various
constituencies to clarify the perspectives from which they are
evaluating assessment objectives, highlighting similarities and
differences relative to the views of others, and aiding in reaching a
consensus. The convergence in the above descriptions suggests that the
prospects are good for achieving a consensus on a conceptual framework
for NAEP.

To this point in time, NAEP has not had any general conceptual
framework that is linked to the overall goals of education. Only one of
a number of alternative possibilities is outlined above, albeit one that
has much to recommend it. The potential benefits of adopting a workable
framework are immense, however, and we recommend that alternative
options be identified and presented to the APC to begin consideration of
this issue.

2. The developifSWt of priorities for assessment among areas, 
subareas, and specific objectives . 

This issue is 'perhaps one of the most sensitive for NAEP and is 
central to the responsibilities of the APC. Up to now, priorities hs^ve 
been primarily reflected in the frequency of the assessments in various 
areas, as noted earlier. One could undertake to count the numbers of 
Itemgi in e^ch aaaeaament thatJiav^ been related to each objective and thua 

-20- 28 



Table 5

Educational Goals for Elementary and Secondary Education
as Adopted by State Governments

States  Goal

Physical and Material Well Being

  22    A. Each individual must develop an understanding of the
           principles involved in the production of goods and
           services and of the skills relating to the management
           of personal resources.

  41    B. Each individual must acquire good health and safety
           habits and an understanding of the conditions necessary
           for physical and mental well-being.

  29    C. Each individual must develop the knowledge and respect
           necessary for the maintenance, appreciation, protection,
           and improvement of the physical environment.

Relations with Other People

  24    D. Each individual must learn the rights and
           responsibilities of family members and prepare for
           family life.

  36    E. Each individual must learn to develop and maintain
           interpersonal relationships and have command of social
           skills.

Social, Community, and Civic Activities

  39    F. Each individual must come to understand and appreciate
           different cultures, governments, races, generations, and
           life styles.

  43    G. Each individual must learn rights and responsibilities
           of citizens of the community, state, and nation.

Personal Development and Fulfillment

  47    H. Each individual must master the basic skills of reading,
           writing, speaking, listening, computation, and problem
           solving.

  38    I. Each individual must master the skills of constructive
           and critical thinking and decision making so that he or
           she can deal effectively with problems in an open-minded
           and adaptable manner.

 [?]6   J. Each individual must gain knowledge of the human
           achievement and experience in the areas of natural
           sciences, social sciences, humanities, creative and
           fine arts.

  41    K. Each individual must gain an eagerness for learning and
           self-development beyond the formal schooling process.

  40    L. Each individual must develop a positive self-image and
           an understanding and appreciation of his or her unique
           capacities, interests, and goals.

  45    M. Each individual must select and prepare for a career of
           his or her choice consistent with his or her
           capabilities, aptitudes, desires, and the needs of
           society.

  35    N. Each individual must develop a personal philosophy and a
           basic set of values, morals, and ethics acceptable to
           society.

  29    O. Each individual must acquire the desire and ability to
           express himself or herself creatively and to appreciate
           creativity in others.

Recreation

  27    P. Each individual must have knowledge of and skills in
           recreation and leisure-time activities for nonvocational
           use of time.

Notes: Information in table is from state governments, except Arkansas,
Indiana, and Minnesota.

The figure at the left of each goal indicates the number of states
that have adopted it as one of their educational goals.



infer the implicit priorities that NAEP has observed. However, no explicit
priorities are evident except for proportional emphasis (percent of
exercises) in the subareas of math and of science. The assumption appears
to have been that all objectives are important, if not equally important.
The problem is that the objectives in many areas outnumber the exercises
that could be administered in a given assessment; hence some objectives
have obviously not been assessed, at least not consistently. Instructions
for NAEP exercise development do not appear to set priorities, and the
primary considerations in deciding on the inclusion of exercises are the
following:

• Whether the exercise is referenceable to any one of the NAEP
objectives, i.e., appears to be a legitimate assessment of
the objective.

• Whether the exercise is relatively low in cost to administer:
individually administered exercises are more costly and have
been dropped in recent years; performance exercises (e.g.,
measurement of small group participation in citizenship) are
also more costly.

This has inevitably meant that some objectives have had lower de facto
priority, either because relevant items were not generated by exercise
developers in the first place or because no cost-feasible means of testing
them had been devised. Such limits are inevitable: resources are never
unlimited, and choices have to be made. NAEP could take a major step
forward in communication with its audience and in stabilizing the indices
of education progress that it can provide, however, by developing explicit
priorities to guide the development of exercises and their assembly into
assessment packages.

We recommend that this issue be raised with the APC. Because of the
sensitivity of this matter and the diversity of perspectives that
potentially exist on priorities, we also suggest here a multifaceted
approach to providing NAEP with a defensible method of priority setting,
one that makes the most appropriate use of the judgment and experience of
all sorts of lay and professional persons concerned with education. This
approach has been used very effectively in the past to define the
objectives of an assessment program. This past experience is useful to
recount in that it indicates the potential of the approach to provide a
reliable basis for determining and stabilizing the knowledge to be
assessed and its potential for clarifying and reconciling what might
appear to be irreconcilable differences in the perspectives among those
who will be affected by the assessment. The circumstances of the previous
use of this approach might appear different from the context of NAEP, and
in fact the purpose was quite different: the development of content
specifications for science tests to evaluate applicants to medical school.
However, many of the basic issues and problems in arriving at a consensus
on what was to be tested were similar, and the lessons learned appear
applicable.

For many years before AIR undertook revision of the admissions test
given by medical schools (the MCAT), there had been increasing debate and
dissatisfaction with the previous assessment tool. Meetings were held
over a two-year period concerning the problem of how to evaluate
applicants. A feeling had arisen that the scientific preparation and
academic quality of the applicants was such that attempting to make fine
distinctions on the basis of more and more advanced and specialized
preparation in science was ill-advised. Conversely, more nontraditional
students, women and members of minority groups, were applying, and there
was a need not only to provide a fair assessment of these applicants, but
to provide a more diagnostic assessment of their strengths and any needs
for remediation were they to be admitted. Yet other traditional
differences in perspective existed among those teaching the basic
sciences in the first two years of medical school, the clinical science
faculty, and practicing physicians, to say nothing of the perspectives of
undergraduate college faculty and advisors. College faculty had serious
concerns about the impact of a national test on their curriculum, which
was not exclusively aimed at preparation for professional school.



When AIR was placed under contract to develop the new assessment 
specifications in 1974, it was in part because we offered a means of 
resolving these differences. Among the persons most likely to be affected, 
however, there was still a high level of anxiety and cynicism about the 
possibility of achieving [remainder illegible].







AIR consulted with educators and studied curriculum outlines and
texts used widely in introductory biology, general chemistry, organic
chemistry, and physics at all sorts of colleges across the country,
including both the most and least selective schools and those with various
missions and student body compositions. The result was a comprehensive
and detailed outline of topics taught at any of these schools in the four
subjects. The deans of all U.S. medical schools as well as representatives
of all the special interest groups were then asked to provide nominations
of persons qualified by experience and breadth of perspective to evaluate
the relevance of prior scientific knowledge to the medical school
curriculum and to medical practice from each of the following groups:
basic science faculty, clinical science faculty, physicians in practice,
and senior medical students and residents. We specially sought nominations
of women and minority group members. In the end, over 300 highly qualified
nominees from all of the concerned groups and segments of the medical
education community were constituted as an evaluation panel. The panel
did not meet. Each individual made deletions, additions, or modifications
as needed and then provided independent evaluations of each topic on the
detailed outline. The first rating was of the extent to which a student's
mastery of the curriculum in medical school would be impaired by a lack of
understanding of the topic at the time of entrance, from "not at all
impaired" to "seriously impaired." The second evaluation was of the
frequency with which the topic was utilized in the practice of medicine.

It was evident that the ratings were done in a thorough and thoughtful
manner.

When the results were examined, it became clear that there was a high
degree of agreement among raters, regardless of their position or personal
background, on the importance of different topics relative to each other.
The differences showed up primarily in the overall levels of the ratings,
which did differ between groups. Thus, students tended to see the science
material, as a whole, as being less important to mastery of the curriculum,
as did women and members of minority groups; medical basic science faculty
gave it highest overall importance. The agreement on relative importance
of topics, however, provided the key to setting priorities for the
inclusion of items on various topics in the test, by providing a basis for
eliminating material the understanding of which was judged unlikely to
affect progress, either because it was not relevant to medical training or
practice or because it was covered in medical school and was not
prerequisite.
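The distinction drawn here, agreement on the relative importance of topics alongside differences in overall rating level, can be made concrete with a small sketch. The group names and mean ratings below are hypothetical and are not the panel's data; the sketch assumes a 1-to-5 rating scale and uses a Spearman rank correlation as one possible measure of agreement on relative importance.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

# Hypothetical mean ratings of five topics by two rater groups
# (assumed scale: 1 = not at all impaired, 5 = seriously impaired).
faculty = [4.6, 3.9, 3.1, 2.4, 1.8]
students = [3.8, 3.2, 2.5, 1.9, 1.2]

rank_agreement = spearman(faculty, students)   # agreement on relative importance
level_gap = mean(faculty) - mean(students)     # difference in overall level
```

In this made-up case the two groups order the topics identically (rank agreement of 1.0) even though one group rates every topic lower, which is the pattern that allows a single priority ordering to serve all groups.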

What is proposed for NAEP is that the process of review and revision
of objectives be structured and supplemented by a similar systematic
survey to obtain the judgments of relevant knowledgeable persons on the
importance of each NAEP objective and possible additional objectives in
terms of the following criteria:

• the extent to which a student's mastery of the curriculum at
the next grade level would be impaired if he or she had not
mastered the objective;

• the extent to which the knowledge, skills, understanding, or
attitudes embodied in the objective make a difference in
adult daily living.

The individuals surveyed should include concerned individuals with
recognized breadth of perspective and judgment representing all facets of
the educational world, including:



• chief state school officers
• legislators
• state and local school board members
• school district superintendents
• school administrators
• teacher educators
• classroom teachers
• students
• lay citizens
• curriculum specialists
• subject matter specialists
• representatives of major professions and career fields



These individuals should be selected for their breadth of expertise and
experience and their interest in education. They should also be selected
so as to include persons from all regions of the country, diverse ethnic
backgrounds, and both sexes. A broad sample of organizations and agencies
representing the perspectives to be included should be enlisted to nominate
persons to be surveyed. Thus, for example, nominations could be invited
from the Council of Chief State School Officers, the National Association
of State Boards of Education, the American Association of School
Administrators, the National School Board Association, the National
Parent-Teacher Association, the National Education Association, the
American Federation of Teachers, and so on. A list of relevant
organizations and individuals to be invited to submit nominations should be
drawn up and specifically reviewed by the APC. The goal should be broad
inclusiveness, with a view to screening only for expertise and relevant
experience and to ensuring sufficient representation of persons from
various perspectives and backgrounds. An Objectives Evaluation Panel
consisting of about 300-400 persons is necessary to ensure sufficient
numbers from each subgroup to permit meaningful comparison of ratings
among subgroups defined in terms of their position within the world of
education and the background of the rater.

We recommend that the process of selection of the Objectives Evaluation
Panel also be used to identify a subset of subject matter experts,
educators, and lay persons with appropriate expertise and interest to
serve on eight Assessment Area Advisory Committees to the separate
assessment areas.

Careful thought will need to be given to the precise task to be
presented to the Panel. As they are currently written, the NAEP objectives
are too variable in format and specificity to be rated directly.
Therefore, prior to the survey, NAEP staff, with the assistance of expert
consultants in each area, would have to prepare a master list of
educational objectives in consistent form for each assessment area. This
list would include all current NAEP objectives plus any additional
objectives suggested by NAEP consultants and staff and by consideration of
sets of objectives developed by other groups. Because the entire set of
detailed objectives is likely to be extremely large, it should be organized
hierarchically within subject matter areas and grade levels: primary,
intermediate, and secondary.




The overall goal is to obtain evaluations in the context of the
entire set of NAEP objectives, not only on a discipline-by-discipline
basis. Thus, all panelists should be asked to rate all objectives on the
two criteria listed above. Specific subsets of the panel, notably subject
matter specialists, teachers, and those dealing with curriculum in
particular subject areas and at particular grade levels, should be asked to
make ratings at a more detailed level within their special areas of
expertise. The detailed objectives should be available to all evaluators,
and the evaluators should be asked to eliminate objectives they consider
unimportant in each area prior to assigning their ratings.

The results of this survey would then be analyzed by NAEP staff both
for the ratings of the panel as a whole and by subgroup, and the results
should be summarized for the APC and the Assessment Area Advisory
Committees. NAEP staff would use the results to prepare:

1. recommended revisions to the objectives,

2. a plan for emphasis in the assessments in terms of the
numbers of items to be allotted to assessment of particular
objectives, and

3. sets of overall exercise scoring weights to be used in
constructing summary performance measures based upon the
rated importance of the objectives.
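One way the second product, a plan for the numbers of items per objective, might be derived from the ratings is sketched below. This is an illustration under stated assumptions and not a procedure specified in this report: the objective names and ratings are hypothetical, items are allotted in proportion to mean rated importance, and a largest-remainder rule (our choice) settles the rounding.

```python
def allocate_items(importance, total_items):
    """Allot total_items across objectives in proportion to rated importance.

    Uses the largest-remainder rule: each objective first receives the whole
    part of its proportional share, and any items left over go to the
    objectives with the largest fractional remainders.
    """
    total_importance = sum(importance.values())
    quotas = {k: total_items * v / total_importance for k, v in importance.items()}
    allocation = {k: int(q) for k, q in quotas.items()}
    leftover = total_items - sum(allocation.values())
    by_remainder = sorted(quotas, key=lambda k: quotas[k] - allocation[k], reverse=True)
    for k in by_remainder[:leftover]:
        allocation[k] += 1
    return allocation

# Hypothetical mean importance ratings for four objectives in one area.
ratings = {"obj_a": 4.5, "obj_b": 3.0, "obj_c": 1.5, "obj_d": 1.0}
allocation = allocate_items(ratings, 20)
```

With these made-up ratings, a 20-item packet would carry 9, 6, 3, and 2 items on the four objectives. Other rules, such as a guaranteed minimum per objective, could be substituted without changing the general idea.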

Different weightings could, for example, reflect the importance of
objectives in relation to the curriculum progress perspective (criterion 1)
and in relation to the knowledge application perspective (criterion 2).
Other weightings might reflect the perspectives of particular regions of
the country. Eventually, specific state and local education agencies and
other bodies might wish to request special reports with weightings of
objectives in relation to some alternative pattern of emphasis. As long
as the pattern of development and administration of exercises across
objectives is controlled appropriately, the number of special indices that
may be reported is virtually unlimited. The ratings, however, would
provide a very defensible basis for the development of basic indices of
progress to be reported on a regular annual basis. And given some
standardization of the format of objectives across areas, it would be easy
to generate special across-area indices such as indices related to
particular areas of adult life or to particular skills (e.g.,
decision-making).
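The idea that one set of results can yield several indices under different weightings can be illustrated as follows. The performance figures and both sets of weights are hypothetical, and the index is assumed here to be a simple weighted mean of percent correct by objective; the report does not prescribe a formula.

```python
def weighted_index(percent_correct, weights):
    """Weighted mean of percent-correct values, one value per objective."""
    total_weight = sum(weights[k] for k in percent_correct)
    weighted_sum = sum(percent_correct[k] * weights[k] for k in percent_correct)
    return weighted_sum / total_weight

# Hypothetical percent correct on three objectives.
performance = {"obj_a": 80.0, "obj_b": 60.0, "obj_c": 40.0}

# Hypothetical weights derived from ratings on the two criteria.
curriculum_weights = {"obj_a": 3.0, "obj_b": 2.0, "obj_c": 1.0}   # criterion 1
application_weights = {"obj_a": 1.0, "obj_b": 2.0, "obj_c": 3.0}  # criterion 2

curriculum_index = weighted_index(performance, curriculum_weights)
application_index = weighted_index(performance, application_weights)
```

The same performance data give an index near 66.7 under the first weighting and near 53.3 under the second, which shows why the weights behind any reported index need to be stated explicitly.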

In future years, the rating process could be repeated in briefer
format to ensure that the objectives remain current or are revised
accordingly. Our experience suggests that this sort of process yields
results that are much more stable and reliable over time than the results
of committee consensus-development efforts. Thus, we anticipate that it
would be desirable to check on the currency of NAEP's objectives at 4-5
year intervals, with appreciable shifts anticipated on a longer time
scale, perhaps over a decade.


3. The development of criteria by which student performance is
to be evaluated.

From its inception, NAEP has wisely avoided attempting to set national
"performance standards" against which schools, districts, and states might
be compared, whether these be so-called minimum standards or standards of
excellence. There is an issue, however, which is often confused with
standard-setting, that does need to be considered in connection with the
statement of individual objectives for NAEP. The issue is whether an
attempt should be made to bring NAEP objectives closer to the ideal of
"true" educational objectives, that is, "instructional outcomes described
in performance terms" (see the Foreword by William M. Shanner to the PLAN
Master Objectives, 1971, reproduced here as Appendix A). As Shanner
pointed out, even some of the systematically constructed objectives of the
PLAN curriculum for grades K-12 in language arts, mathematics, science,
and social studies and guidance were open to multiple interpretation
because they lacked a description of the stimulus condition under which a
student was to perform or were simply statements that failed to suggest
any sort of criteria. NAEP objectives, drafted originally in a period
when the concept of instructional objectives was less familiar and when
the thought of any kind of national assessment was potentially threatening,
are subject to similar criticism, but the problem is somewhat more
pervasive. As Shanner has pointed out, "to have critical comments made
about one's objectives should be taken as a compliment, since this can
only happen when one has taken the trouble to think them out and write
them down." In this sense, NAEP is deserving of high praise for its
efforts to make explicit many important objectives of education. However,
there are important reasons why objectives should specify stimulus
conditions, e.g., "Given a written passage whose tone makes us judge a
character's action unfavorably,..." or "Given a bank's interest rate, a
credit union's interest rate, an amount of money to be borrowed,..." and
so on. They should then specify what the student is to do and the
criterion by which performance is to be judged adequate or inadequate; in
the latter example, "determine whether a loan from the bank or one from
the credit union would cost less." Another example might be "Given a list
of statements describing group relationships, recognize those that show
prejudice and those that do not," or "Given the major digestive structure
of humans, identify the function of each structure," or again "Identify
three ways in which an individual in the United States can influence the
decisions made by his elected representatives."
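The three parts of a fully stated objective described above (stimulus condition, required performance, and criterion) can be represented as a simple record, which makes it easy to flag objectives that leave a part unstated. The class and field names below are our own illustration, and the criterion wording for the loan example is an assumption, since the source gives only the stimulus and the task.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PerformanceObjective:
    """An objective stated in performance terms (field names are illustrative)."""
    stimulus: str    # the condition under which the student performs
    behavior: str    # what the student is to do
    criterion: str   # how performance is judged adequate or inadequate

    def missing_parts(self):
        """Return the names of any parts left blank."""
        return [name for name in ("stimulus", "behavior", "criterion")
                if not getattr(self, name).strip()]

loan = PerformanceObjective(
    stimulus=("Given a bank's interest rate, a credit union's interest rate, "
              "and an amount of money to be borrowed"),
    behavior=("determine whether a loan from the bank or one from the "
              "credit union would cost less"),
    criterion="the lower-cost lender is identified correctly",  # assumed wording
)

# A loosely stated objective of the kind criticized in the text (hypothetical).
vague = PerformanceObjective(stimulus="", behavior="understand consumer credit",
                             criterion="")
```

A master list held in this form could be screened mechanically before it goes to raters or exercise writers; for the loosely stated objective, the check reports that the stimulus and the criterion are missing.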


Avoiding ambiguity and properly stating the detailed assessment
objectives is the only way of providing effective guidance for exercise
development, and in the end, the exercises, not the objectives themselves,
provide the measures of educational progress. Instructing exercise
developers in a loose fashion to "write items that are a direct measure of
some knowledge, skill, or attitude stated in the objective" or to "be sure
that they measure something which will be meaningful to report" does not
provide them with enough guidance to ensure that the exercise will be a
valid measure of the objective. The basic objectives should do much more to
constrain the writer to an appropriate task to present to the student in
order to measure performance on each objective.



Annual Assessments of Learning Areas



There is widespread agreement that the deterioration of performance 
of NAEP to the point that assessments cannot even be conducted once a year 
is unacceptable. Moreover, we claim, the early design decision to focus
each year on a different assessment (or, originally, assessments) is not 
optimal for efficiently achieving the goals for which NAEP has been and is 
intended. 

We recommend that NAEP carry out annual assessments, spanning the 
space of skills that make up educational progress each year. This change
to NAEP will 

(1) increase its utility by making it possible to respond to 
needs for data on emerging policy and research issues; 

(2) increase its utility by increasing the power and stability 
of the educational progress time series; 

(3) increase its utility by creating a basis for estimating 
the relations of educational achievement in different 
areas to each other as well as to program factors; 

(4) increase its efficiency by eliminating redundancies across 
areas; 


(5) increase its efficiency by introducing a smoother flow of 
exercise development, data collection, and analysis and 
reporting activities; and 

(6) increase its acceptability to students and teachers by
providing exercise packages that are more interesting and
have higher face validity for the goals of assessment.

It may appear difficult to reconcile these numerous benefits with the 
fact that the direction NAEP has taken has been away from broad annual 
assessments. In this chapter, we will (1) present an overview of our 
recommended change, (2) explain the benefits listed above and how they can 
be achieved, and (3) describe and estimate the costs associated with this 
change. 






General Specifications

Evaluation of a strategy such as this requires joint consideration of
a variety of factors. To provide a basis for the evaluation of annual
assessments, we describe here the overall design we have in mind. We
recommend that the NAEP grantee carry out a single assessment each year,
using matrix sampling as before, but covering objectives in most or all of
the areas that NAEP was designed to address, including:



Reading                 Other Areas:
Writing                   Career Development
Mathematics               Foreign Language
Science                   Art
Social Studies            Music
                          Health



with exercises designed so as to assess: 



Knowledge Acquisition 
Internalization of Processes 
Ability to Apply Skills 
Attitudes toward Skills. 






The number of exercises in each content area may change from year to year
as information needs as well as the skills to be learned in schools
change. The previous chapter discussed the process of objectives
refinement.

We assume that between 500 and 700 exercises will be assessed for
each age group each year, in roughly 30 packets of 20 exercises each. The
critical change is that each packet would cover most, if not all, of the
areas of the assessment. To allow estimation of correlations between
items in different packets, each student would complete three of these
packets. In carrying out this plan, each selected student would need to
take part in a two-hour testing session. (We have not encountered any
strong opposition to this increase in testing time in our discussions of
this issue with leading state educators, especially if useful feedback
could be given to the schools; however, our plan would also work if
testing could not be done in two-hour sessions.)



Under this design, somewhat fewer students are needed at each age
group or grade than in past assessments (roughly 20% fewer), because of
the doubled administration time and the increased power and efficiency
provided by using integrated instrumentation each year. Each assessment
exercise will be given to roughly 2000 students (requiring a total of
20,000 students in each age group), and every exercise will be paired with
every other exercise for approximately 140 students. Although the data
base for each exercise taken by itself is smaller than in some of the past
assessments, the power of the assessment will be much greater because
(1) exercises will be combined rationally into composite scores and
(2) results of assessments can be combined, as well as compared, from each
year to the next.
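The figures quoted above follow from the design parameters by simple counting, as the sketch below shows. It assumes that packets are assigned to students in a balanced way, so that every packet and every pair of packets is used equally often; that balance is our assumption for the purpose of the arithmetic.

```python
from math import comb

packets = 30               # packets per age group
exercises_per_packet = 20
packets_per_student = 3    # each student completes three packets
students = 20_000          # students per age group

total_exercises = packets * exercises_per_packet

# Each student contributes responses to 3 of the 30 packets.
students_per_exercise = students * packets_per_student / packets

# Two exercises in different packets are answered together only by students
# who received both packets. Each student covers C(3, 2) = 3 of the
# C(30, 2) = 435 possible packet pairs.
packet_pairs = comb(packets, 2)
pairs_per_student = comb(packets_per_student, 2)
students_per_cross_packet_pair = students * pairs_per_student / packet_pairs
```

This gives 600 exercises, 2000 students per exercise, and about 138 students for each pair of exercises drawn from different packets, in line with the "approximately 140" cited. Exercises within the same packet are paired for all 2000 students who take that packet.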

The schedule of activities for NAEP under this modified design would
constitute a two-part "fugue." The time period for a particular
assessment, from initiation of exercise writing to publication of the main
report and public-use tapes, would be two years (one year of preparation
and one of data collection), so that at any particular time work will be
in progress for two successive assessments. At the beginning of each
two-year assessment period, projections of issues that will be important
two years later will be made and presented to the Assessment Policy
Committee. These deliberations will lead to the weighting of different
areas in the development of the assessment forms (i.e., how many items to
include in each area) and possibly to the occasional identification of new
objectives that may need to be added to the assessment. They may also lead
to plans for oversampling of certain subpopulations of schools or students
during the assessment.
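The overlap of successive two-year cycles can be laid out mechanically. The sketch below is illustrative only: it assumes that a new cycle starts every year and that cycles align with whole years, and the function name and the starting year used in the example are hypothetical.

```python
def assessments_in_progress(year, first_start_year):
    """List the assessment cycles active in a given year.

    A cycle that starts in year s spends year s in preparation and
    year s + 1 in data collection, so at most two cycles overlap.
    Cycles are identified by their starting year.
    """
    active = []
    for start in (year - 1, year):
        if start >= first_start_year:
            phase = "preparation" if start == year else "data collection"
            active.append((start, phase))
    return active

# Hypothetical example: the first cycle under the new design starts in 1984.
first_year = assessments_in_progress(1984, 1984)
second_year = assessments_in_progress(1985, 1984)
```

In the first year only one cycle is under way; from the second year on, one assessment is always in data collection while the next is in preparation.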


Exercise writing can occur throughout the year at times convenient to
the item writers, but during the first three months of each assessment
period, the goal will be to complete the pool of exercises to be used in
that assessment. The following six months will be used for review,
tryouts, revision, and approval of drafted forms. At the same time, schools
will be selected and contacted for participation in the following year.

The instruments that are developed will be administered according to
a schedule like that currently in place, in order to maximize the
comparability of the data across time. Activities in the spring of each
year will focus, rather intensely, on data processing and the production
of reports and public-use tapes. In this way, the results of each annual
assessment can be released in the summer following the assessment. Of
course, this does not mean that numerous secondary analyses will not be
carried out at a more deliberate pace.



Potential Benefits

Let us now consider the benefits that will accrue due to this
innovation in design.

First, because all content areas are potentially covered by the
assessment each year, the data base will be much more responsive to
program evaluation and policy needs, both before data collection and after
the fact. As soon as an issue appears to be emerging, while the specific
objectives for one year's assessment are being weighted, planning can be
carried out to collect the background and program data from the students
and schools that will make the achievement data relevant to the
information needs. There will be no need to wait several years for the
next cycle in which the relevant assessment (e.g., science or reading) will
be performed.

Even after the emphases of an assessment have been decided upon, the
possibility for annual comparisons, related to program data collected from
a different source, will dramatically increase the relevance of the
achievement data to policy research and evaluation needs. For example,
with the current design, it is impossible to evaluate the possible effects
of the change from "Title I" to "Chapter I" on achievement, because the
basic skills assessments have insufficient "resolution." If comparable
assessments had been carried out each year between 1975 and 1985, we would
be in a better position to make statements about the importance, or lack
thereof, of the federal role in compensatory education.

This design is also highly compatible with a partially longitudinal
school sampling design, in which each school would be visited in two or
more assessment years. The use of an integrated annual assessment
instrument covering essentially the same areas each year is, of course,
necessary if the longitudinal study is to be part of the overall
assessment.

Even if there were no other advantages, this greater flexibility for
use of the NAEP data to address policy research and evaluation questions
would, we believe, outweigh the costs associated with the change (which
are discussed in the section below).

The second major advantage that derives from annual assessments that
integrate most or all content areas is the greater stability and power in
the time series. Over 15 years, at most four assessments have been
performed in each content area, and this frequency is quite insufficient to
begin to develop time-series projections or tests of hypotheses about
relations of various factors to educational achievement. At the present
rate, twenty additional years will be required before the data series
become useful for these purposes; however, addition of a new data point
every year, even if it were somewhat less reliable, would dramatically
reduce the waiting time before the data could be used for econometric and
other modeling. Moreover, the currently accepted five-year periodicity
for the core assessments is too long to provide the basis for investigating
relations to events that change from year to year. If the math assessment
results in 1982 differ from the results in 1978, it is impossible to
relate the changes to exogenous factors that changed over those four
years. If math assessments had been carried out annually, however,
possible effects of such factors as the growth of computer awareness, of
unemployment, or of school closures on math skills or attitudes might be
examined.

The third way in which the proposed design would increase the utility
of NAEP data is by providing data for estimating relations among
achievement in different areas. Because each exercise set given to a
student would span the range of educational objectives, and because the
recommended matrix sampling design would ensure that an adequate number of
students respond to each pair of exercises (by combining all possible
pairs of packets of 20 exercises when composing the two-hour test
booklets), complete interitem correlation matrices could be estimated.
This will open up a broad vista of analytical uses for NAEP data not now
possible.
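How an interitem correlation matrix can be estimated when no student takes every exercise is sketched below. The approach shown, correlating each pair of items over only those students who answered both, is one common option and is our assumption; the item names and the responses are made up.

```python
from itertools import combinations
from statistics import mean

def pairwise_correlations(responses, min_n=2):
    """Estimate item-pair correlations from incomplete response records.

    responses: list of dicts, one per student, mapping item id to a score
    (here 0 or 1) for the items that student was given.
    Returns {(item_a, item_b): (correlation, n_students)} for every pair
    answered together by at least min_n students and showing some variation.
    """
    items = sorted({item for record in responses for item in record})
    result = {}
    for a, b in combinations(items, 2):
        pairs = [(r[a], r[b]) for r in responses if a in r and b in r]
        if len(pairs) < min_n:
            continue  # too few students saw both items
        xs, ys = zip(*pairs)
        mx, my = mean(xs), mean(ys)
        sxx = sum((x - mx) ** 2 for x in xs)
        syy = sum((y - my) ** 2 for y in ys)
        if sxx == 0 or syy == 0:
            continue  # correlation undefined when an item has no variance
        sxy = sum((x - mx) * (y - my) for x, y in pairs)
        result[(a, b)] = (sxy / (sxx * syy) ** 0.5, len(pairs))
    return result

# Hypothetical records: each student answered only some of the items.
responses = [
    {"read1": 1, "sci1": 1},
    {"read1": 0, "sci1": 0},
    {"read1": 1, "sci1": 0},
    {"read1": 1, "sci1": 1},
    {"read1": 1, "math1": 0},
    {"read1": 0, "math1": 1},
    {"sci1": 1, "math1": 1},
]
correlations = pairwise_correlations(responses)
```

Each entry carries the number of students behind it, which matters because estimates for pairs of exercises in different packets would rest on far fewer students than the estimates for single exercises.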

Nearly all policy research issues, evaluation issues, or educational
practice issues that can be addressed empirically require data on
relations, not merely on levels of single variables. In many cases, the
most important relations are between an achievement score and a vector of
hypothesized factors that affect that score. This is not the only
important type of relation, however. Relations among achievement test areas
are also important. For example, the following issues require these
relations:

• Is low science achievement by particular target groups 
related to low reading scores? 

• Does a program that raises reading scores also raise science
scores?

• Are problem solving skills generalized across the areas of 
science, math, social studies, and reading? 

• What abilities are related to a positive attitude toward 
activities in science, or in music or art? 

• For which content areas is variance between schools largest,
for which is variance within schools largest?

• How many factors characterize within-school achievement?
Between-school achievement?
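The question about between-school and within-school variance can be illustrated with a simple descriptive decomposition. The school names and scores below are hypothetical, and the sketch splits the total (population) variance into its two parts; it is not a random-effects estimate and ignores sampling weights.

```python
from statistics import mean

def variance_components(scores_by_school):
    """Split the total variance of scores into between- and within-school parts.

    Both parts are divided by the total number of students, so they sum
    to the population variance of all scores pooled together.
    """
    all_scores = [s for scores in scores_by_school.values() for s in scores]
    grand_mean = mean(all_scores)
    n = len(all_scores)
    between = sum(len(scores) * (mean(scores) - grand_mean) ** 2
                  for scores in scores_by_school.values()) / n
    within = sum((s - mean(scores)) ** 2
                 for scores in scores_by_school.values() for s in scores) / n
    return between, within

# Hypothetical composite scores in one content area for two schools.
scores = {"school_a": [60, 70, 80], "school_b": [40, 50, 60]}
between, within = variance_components(scores)
between_share = between / (between + within)
```

Here 60 percent of the variance lies between schools. Repeating the calculation for each content area would show where school-level differences are largest.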

Each of these three ways of increasing utility (greater responsiveness,
greater power and stability, and greater research applicability)
would itself outweigh the costs of this change in the design of NAEP, but
there are also increases in the efficiency of NAEP data collection that
compound the advantages of the integrated annual assessment.




First, by considering the space of educational objectives in a unified
framework, substantial redundancies in test items can be eliminated. For
example, it will be unnecessary to include subject matter reading items
twice, both as assessments of reading and as assessments of the subject
matter. And, to the extent that problem solving involves the same skills
across areas, this type of skill need not be covered as completely in each
and every area. The implication of the elimination of these redundancies
is that more independent information can be gathered in the same time
period. Items whose response could be very well predicted from other
responses can be replaced with more informative items.

Another benefit to efficiency is in the flow of operations in the
conduct of NAEP. "Start-up" costs for different assessments will be
virtually eliminated because the assessments will be continuously
ongoing. In particular:

• the pool of exercise writers will be continuously monitored
and improved in all subject matter areas;

• exercise writing can be done on a convenience schedule,
which will increase the availability of top levels of expertise;
• forms preparation and clearance procedures will be much
more similar from year to year;

• requirements for data collection staff training will be
reduced because of the similarity of procedures from year to
year;

• psychometric analyses aimed at improving the item pool can 
be standardized, thus reducing cost; and 

• a common format for an annual Report Card (an annual 
National Assessment summary report) will facilitate its 
quick production each year. 



Finally, we expect that the new format, with items from a variety of
areas, will be better received by both schools and students. Students
will like participating better because the assessment will be both less
boring and less intimidating. The reduction in intimidation will come
from the fact that it will be clearer that the assessment is really not a
test of any individual student, because so little time is given to any
particular content area. The attraction to schools will be that the
assessment package in any particular year more clearly covers the breadth
of the educational objectives of the school, so that if the school chose
to extend the testing to a larger fraction of its students, the NAEP
instrument would be a reasonable assessment tool for the school. The same
logic also applies at the LEA and SEA levels. Consequently, we expect
that the change in design may actually reduce efforts needed to induce
schools and students to provide the required data, even with the proposed
increase in individual testing time.

Potential Costs 

On the cost side of the equation, there are two factors to be
considered: (1) that data collection will occur each year, rather than
every other year, and (2) that the assessment will span the range of
subject matter areas each year, rather than focusing on one or two areas.
Concerning the first point, AIR has estimated costs and finds that there is
no reason to suppose that assessments cannot be carried out every year
within the authorized budget. The more interesting questions center on
the costs of an integrated assessment vs. separate subject matter
assessments.

A first reaction to this proposal for change in the NAEP design might
be that there is only one essential cost increase that must be taken into
account in evaluating its feasibility and desirability: the cost of
increased testing time per student, if we are to conduct two-hour sessions
instead of one-hour sessions. However, in order not to overlook any
substantial cost component, we must consider all three phases of the
assessment in detail: instrument development, data collection, and
analysis and reporting. The increased testing time, of course, falls in the
category of data collection costs.

When comparing an integrated assessment design to separate assessments
of content areas, the instruments would be made up of approximately the
same number of items, of approximately the same types, so that increases
in instrument development costs would be of secondary importance at most.
We have noted that, in fact, the integrated and continuous nature of the
assessments may actually improve the efficiency of item development.

Two cost increments may result, however. First, a choice will arise
that was not previously present: quantitative choices of how many items
in each subject matter area to include in each assessment. In one sense,
this is an easier decision than deciding whether or not to perform an
assessment in an area like consumer education, for example, because a few
items can be added or dropped much more inexpensively than whole
assessments can be mounted. On the other hand, we recommend a careful
analysis of issues and objectives at the beginning of each assessment and
consideration by the Assessment Policy Committee (APC) of the
appropriateness of the coverage proposed for each area.

This choice may require substantial discussion, at least in the first
year or two, because there has heretofore not been an opportunity to make
such choices between areas in the design of NAEP. While the barrier of
mounting a separate assessment may seem to many to be an acceptable excuse
for failing to include some subject matter area, that argument does not
hold for addition of a dozen items. To avoid undue expenditure of effort
in this consideration, the NAEP grantee will need to prepare justification
for the choices prior to presentation to the APC, while being willing to
alter the design on the basis of the deliberations of that committee. One
such justification might be the percentage of school time allocated
currently to the particular areas; another might be the relative number of
items assessed by NAEP, summed over the past dozen years. More
forward-looking criteria include focus on skills needed for careers,
reducing emphasis on skills that are becoming obsolete, and emphasizing
areas about which policy issues are arising.

The second potential cost increment involves the prudent review of
the coverage of objectives in each area by professional associations that
have some responsibility for curricula in those areas. Although in the
past it was necessary to maintain close communications with a particular
association only when planning an assessment in that area, a more
continuous interaction will be required when all areas are being
simultaneously assessed. In spite of the fact that the resulting
interactions will require careful management, we believe that they will
ultimately benefit the effort to assess educational progress as a whole.


The major cost increment, as noted above, is in the data collection
phase. This is primarily a cost to the participating schools and their
students, rather than a direct cost to the government, so its analysis is
particularly critical.

Rather than conduct an assessment involving 25,000 students for one
hour each, our proposal is to conduct an assessment involving 20,000
students for two hours each. The reasons that two-hour sessions are to be
preferred over one-hour sessions for the purpose of the assessment are
(1) data collection costs to the government are smaller per examinee-hour
and (2) it would be difficult to cover the range of subject matter areas
and also collect important background information in one hour. The
incremental burden is from 25,000 student hours to 40,000 student hours, but
this needs to be placed in the perspective of the 3 billion instructional
hours that occur nationally at each grade level. The argument that the
additional hour is a significant loss to the individual students is
partially countered by the fact that the experience of the assessment may
itself be educational. More important is the increased burden on schools'
scheduling that may occur when planning for a two-hour session instead of
a one-hour session. To minimize this burden, care must be taken to work
with schools at an early point to set up convenient schedules. School
administrators we have talked to agree that the difference between one-
and two-hour testing sessions is an important but not crucial factor in
deciding on participation in NAEP and that two hours would be quite
acceptable if a reasonable rationale were presented.

One particular cost that has been suggested to be associated with
testing across subject matter areas is the increased proportion of time
needed for instructions. This position is questionable, however, because
to a great extent instructions focus on the format of exercises, not their
content. It should be possible to define exercise formats so that the
same formats occur across the various topics. Multiple-choice items in
biology, health, history, and reading inference skills should not require
separate instructions. Thus, even though several content areas are
contained in an exercise booklet, as long as the exercises are grouped by
format, little or no additional instructional time should be needed.
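
To put the data collection burden discussed above in proportion, the
student-hour arithmetic can be sketched as follows. This is only an
illustration: the function and variable names are hypothetical, and the
national figure is the roughly 3 billion instructional hours per grade
level cited in the text, taken at face value.

```python
def student_hours(n_students, hours_each):
    """Total examinee time for one assessment design."""
    return n_students * hours_each

# Figures quoted in the text.
current = student_hours(25_000, 1)    # one-hour sessions
proposed = student_hours(20_000, 2)   # two-hour sessions
increment = proposed - current

# Assumption: national instructional hours per grade level, as cited above.
NATIONAL_HOURS_PER_GRADE = 3_000_000_000
share = proposed / NATIONAL_HOURS_PER_GRADE

print(current, proposed, increment)
print(f"share of national instructional hours: {share:.6%}")
```

Even under the proposed design, the assessment would occupy a very small
fraction of national instructional time; the scheduling burden on
individual schools is a separate matter that this arithmetic does not
capture.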

Finally, we need to consider costs associated with analysis and
reporting. For the primary assessment report, the costs should be
decreased, not increased, because the format and content will remain
constant across assessment years. On the other hand, if one wishes to focus
on a particular subject matter area, more powerful analyses can be done by
combining data across assessments. This is essentially an extension of
"matrix sampling" to the dimension of "years," and it requires the same
type of combination algorithm as the other score generations based on
matrix sampling. Therefore, we do not expect additional analysis costs to
be large.



Conclusion 

From this evaluation, we conclude that the benefits of this design
change to NAEP significantly outweigh its costs. In addition, integrated
annual assessments would be especially beneficial when combined with
certain other design and procedural changes discussed in this report (for
example, issue-based weighting of objectives, psychometrically
sophisticated item development and analysis, or computer-assisted testing).






Measurement Founded on Modern Psychometric Theory



Background 



When the National Assessment was being planned in the late 1960s, it
was realized that traditional approaches to behavioral measurement would
have to be modified very substantially to realize the goals of the
assessment. Conventional sampling and psychometric techniques used commonly
in psychological and educational measurement are designed to assess
individuals on some psychological variable or achievement construct (see
Lord & Novick, 1968). Examples of the former are measures of locus of
control (e.g., Rotter, 1966) or test anxiety (Mandler & Sarason, 1960).
Examples of the latter are typical vocabulary or arithmetic tests used to
assess attainment of individual students.

The founders of NAEP were correct in realizing that the National
Assessment should not rely on conventional psychometric methods. However,
they were not successful in developing an alternate methodology that
solves the nonstandard psychometric problems that an assessment presents.
Furthermore, piecemeal modifications to the original strategy introduced
in recent years are equally unsatisfactory. The following discussion
describes the salient characteristics of the national assessment, the
methodological problems these characteristics pose, and two attractive
solutions to these problems.

The goals of a national assessment differ from conventional testing
in at least two important ways. First, measurement at the level of the
individual is not a goal; results are not to be used in making decisions
on an individual examinee. Instead, an assessment should be designed to
assess a group or an aggregate so that decisions about the progress of the
group as a whole can be made. Ultimately, the "group" to be evaluated is
the entire nation; however, smaller units of aggregation such as regions
of the nation, states, districts, and types of schools are also of
interest.






The second important difference concerns the type of material to be
assessed. An assessment of educational progress should not be primarily
concerned with measuring basic psychological constructs such as
intelligence or spatial ability. Rather, an assessment is more properly
concerned with measuring attainment in a large number of specific skill
areas that make up the curriculum in the schools or are thought to be
important for functioning adequately in society.

Aptitude versus Achievement 

In the educational literature, this distinction is made between
measures of aptitude and achievement (see DuBois, 1969; Green, 1974; Snow,
1980). Although the difference between the two concepts is by no means
clear-cut, achievement, in general, refers to degree of mastery of some
specified performance, while aptitude refers to an individual's ability to
learn in the future. Some have thought of aptitude as stressing inherited
ability, ease of acquisition, or relative fitness. Others have called
aptitude "generalized achievement" and emphasized capacity to learn, solve
problems, and reason logically.

It is clear that a major focus of any educational assessment should
be on achievement rather than aptitude. Rather than reporting only on a
single generalized achievement score, such as the verbal SAT, an
assessment must report on attainment in each of a number of diverse skill
areas. If there is any proper analogy to the Scholastic Aptitude Test, it
is to their so-called "advanced" tests in specific content domains (e.g.,
foreign languages, physics, advanced mathematics) rather than to the two
(much more publicized) basic aptitude scores.

The psychometric problem this creates is that the assessment must
cover a highly multidimensional space and report on very specific content
areas. Because an assessment does not focus on a few basic psychological
variables, the conventional psychometric model of items as multiple
indicators of a single ability may not be appropriate. In fact, items
appropriate for an assessment can be highly curriculum dependent. That is, a
correct response to an item by a student may depend much more strongly on
the instruction the student has had than on his or her "ability."
Thus, a psychometric model that presupposes that a correct response is a
function of the examinee's "latent ability" may be less applicable to the
psychometric problems posed by assessments than to the problems posed by
the measurement of aptitude or ability.

Multiple Matrix Sampling

These differences led the founders of NAEP to construct a design that
differed radically from the design of a conventional large-scale testing
program. Perhaps the most dramatic difference is in the area of
sampling. Because the assessment of individual students assumed no
importance, it was decided to employ nonoverlapping multiple matrix sampling
techniques rather than conventional examinee sampling. With this approach,
each examinee responds to only a few items corresponding to a particular
objective but responds to a broader range of items than would be possible
with examinee sampling. The latter feature is important, given the
requirement that the assessment report on attainment in many diverse content
areas.



Used in the context of assessments, multiple matrix sampling has
several important advantages. First, proper execution of item-examinee
sampling will yield more precise estimates at the group level than will
examinee sampling. Second, it facilitates the administration of a wider
variety of items within fixed time constraints since each examinee does
not have to respond to all items in the entire assessment. Third, it
lessens response burden on the schools and students and serves to lessen
fears among students that the results will be used to evaluate them
individually.

The use of multiple matrix sampling is unfamiliar to most educators
and researchers, however, and its use creates several added complexities.
The most important is that the increased precision for measuring
attainment at the group level can only be realized if the appropriate
estimates are computed from several matrix samples. This procedure is
unfamiliar to most researchers, who are familiar only with constructing
scales within a single matrix sample (i.e., dataset). The necessity of
constructing a scale from items in several matrix samples is hard for most
persons to grasp. To illustrate this point, consider the following example.

Suppose a researcher is interested in measuring some achievement
variable (say, addition of fractions) at the classroom level. Fifty
classrooms constitute the primary sampling units. Within each classroom, ten
students are randomly selected, and each student responds to one randomly
assigned item out of a set of ten items. Each classroom responds to all
items, but each student responds to only one (randomly assigned) item. In
this instance, it is clear that this design assesses classroom achievement
more precisely than a design in which one student per classroom responds
to all items. However, the logic of the matrix sampling design requires
that the researcher assemble each classroom's score from the responses of
each of its ten students. The score for each classroom is composed of the
ten item-examinee samples administered within that classroom. What should
be reported as an estimate of the score for the population of classrooms
is one score based on five hundred individual responses, each to one of
the ten related items. The relevant components of this single estimate
are the scores from each of the fifty classrooms that constituted the
primary sampling units in the study.

Although this is not a difficult procedure, most educational
researchers are unaccustomed to it. In fact, the equivalent of this has not
been done by the National Assessment. Instead, responses at the level of
the item within a matrix sample are reported. In our example, what would
have been reported by NAEP would be the proportion of students responding
correctly to each of the ten items. This would produce ten statistics,
each based on fifty scores (one student per classroom).

It is apparent that this procedure does not capitalize on the facts
that (1) a ten-item scale measuring one variable exists, (2) precision at
the classroom level has been maximized (for the ten-item test), and
(3) precision for the overall estimate has been maximized. In fact, the
current NAEP procedure of within-booklet (i.e., within matrix sample)
reporting undermines a major technical advantage of matrix sampling and
leads to fragmentation of the results of the assessment. The latter point
is discussed in greater detail below.
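
The fifty-classroom example above can be sketched in a few lines of
simulation. This is a minimal illustration under stated assumptions, not
a NAEP procedure: the item difficulties, the classroom effects, and all
function names are hypothetical, and responses are generated at random
purely to show how the two kinds of summary are assembled.

```python
import random

def simulate_matrix_sample(n_classrooms=50, n_items=10, seed=0):
    """One item per student, ten students per classroom (as in the example)."""
    rng = random.Random(seed)
    # Hypothetical probabilities of a correct answer for the ten items.
    p_item = [0.30 + 0.05 * j for j in range(n_items)]
    responses = []  # (classroom, item, correct) triples
    for c in range(n_classrooms):
        # Hypothetical classroom effect shifting every item's probability.
        shift = rng.uniform(-0.15, 0.15)
        items = list(range(n_items))
        rng.shuffle(items)  # random assignment of items to the ten students
        for item in items:
            p = min(max(p_item[item] + shift, 0.0), 1.0)
            responses.append((c, item, 1 if rng.random() < p else 0))
    return responses

def group_means(responses, key_index, n_groups):
    """Mean correct per group, grouping by classroom (0) or by item (1)."""
    totals = [0] * n_groups
    counts = [0] * n_groups
    for record in responses:
        totals[record[key_index]] += record[2]
        counts[record[key_index]] += 1
    return [t / n for t, n in zip(totals, counts)]

responses = simulate_matrix_sample()

# Scale-level summary: each classroom score pools its ten item-examinee
# samples, and the overall estimate rests on all 500 responses.
classroom_scores = group_means(responses, 0, 50)
overall = sum(classroom_scores) / len(classroom_scores)

# Item-level summary: ten separate statistics, each resting on only
# fifty responses (one student per classroom).
item_p_values = group_means(responses, 1, 10)

print(f"overall estimate from {len(responses)} responses: {overall:.3f}")
print("item-level proportions:", [round(p, 2) for p in item_p_values])
```

The single pooled estimate uses ten times as many responses as any one
of the item-level proportions, which is the precision advantage that
within-booklet reporting leaves unused.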



Reporting Results from Individual Exercises



In response to the need of the assessment to focus on achievement
rather than aptitude, the original planners of NAEP decided that the
reports on educational progress should take an unusual form. Rather than
constructing any type of scale score, NAEP reports would be written in
terms of responses to each individual exercise. Such a strategy is
consistent with the goal of reporting in very specific content areas but has
some severe limitations. First and foremost, for the meaning of the
percentage correct to be interpretable, the exercise must have an
importance that is self-evident and unambiguous to the reader. Such
conditions are approximately met in opinion research, such as the Gallup
poll and its competitors. For example, questions pertaining to voter
preference in a specific election have meaning on their own, without appeal
to any psychological construct (e.g., liberalism or conservatism) or as
being representative of some domain. This so-called "fixed-item" approach
works well in social survey research in which the responses to items may be
interpreted at face value. This is especially true when the question
pertains to some particular action the respondent may take (e.g., voting
preferences, response to a draft notice). In such instances, the question
is not thought of as one of a larger universe of questions.






Very few educational test items have such singular importance,
however. More typically, test items are interpreted as representatives of a
population of test items that could be written to assess a particular
skill. The crucial point is that no single item is accepted as the
definition of the skill. Instead, it is accepted that several items define a
domain and that examinees respond probabilistically to these items (i.e.,
some examinees may get an item right due to guessing, and others may get
an item wrong due to carelessness). Thus, a strategy of reporting at the
item level is fraught with interpretational difficulties. In the area of
the NAEP mathematics assessment, Haertel (1981) posed the basic issue:



Only 7% of seventeen-year-olds could correctly solve the
equation $(x-2)^2 = 9$ for x, but in another sample, 18% could
find the solution set of $x^2 - 5x + 4 = 0$. Is the difference
due to the wording of the problems? The particular
numbers? The format of the equations? How are we to
generalize about the proportion of seventeen-year-olds who can
solve quadratic equations? How would the p-values change if
these items were multiple choice, say, rather than free
response? There is no way to tell.

On the one hand, it is clear that the percentage of correct responses
reported for exercises such as these conveys little, if any, meaning since
those percentages are a function of the format of the exercise, distractors
used, and difficulty of the particular exercise stem. But on the other
hand, the public and the educational community are interested in knowing
about level of achievement in very specific content areas such as "solving
quadratic equations." The requirement that an assessment report on
attainment in many skill areas does not, in fact, free it from the
requirement that those reports be in some interpretable metric that is
invariant with respect to choice of exercise set within a skill area.



Latent Trait Analysis 


An important challenge to NAEP is to develop and use a methodology
capable of reporting in specific skill areas without becoming tied down to
specific exercises. In fact, what is needed is methodology that directly
and unambiguously addresses questions such as "How are we to generalize
about the proportion of seventeen-year-olds who can solve quadratic
equations?" Such a methodology should be capable of addressing questions
phrased in terms of the skill areas themselves, independent of particular
exercises chosen within a domain, and produce estimates invariant with
respect to the exercise-examinee sampling procedures.



A natural place to begin development of such a methodology is latent
trait or so-called "item response" theory. Originally developed in the
context of the measurement of individuals, this family of models can be
used to produce scale scores in an arbitrary metric in interval scale
units. Unidimensional exercise sets are produced in the test development
stage and calibrated in preliminary studies. Once the characteristics of
the exercises are known, any subset of exercises in the item bank can be
administered and individuals' scores on the latent trait can be estimated
from the results. These scale scores are comparable even if some examinees
get different subsets of the exercise set than others.
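
The idea that scores stay comparable across different subsets of a
calibrated bank can be illustrated with a small sketch. It assumes the
one-parameter (Rasch) logistic model, which is only one member of the
latent trait family and is chosen here for brevity; the item bank, its
difficulty values, and the function names are all hypothetical.

```python
import math

def p_correct(theta, b):
    """Rasch item response function: P(correct | ability theta, difficulty b)."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def estimate_theta(responses, difficulties, lo=-6.0, hi=6.0, tol=1e-9):
    """Maximum-likelihood ability estimate from any subset of calibrated items.

    Solves sum(p_correct(theta, b)) == sum(responses) by bisection; the
    left side increases with theta, so the root is unique.
    """
    target = sum(responses)
    if not 0 < target < len(responses):
        raise ValueError("all-wrong or all-right patterns have no finite estimate")
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        expected = sum(p_correct(mid, b) for b in difficulties)
        if expected < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Hypothetical calibrated item bank (difficulties on the latent scale).
bank = {"a": -1.0, "b": -0.5, "c": 0.0, "d": 0.5, "e": 1.0, "f": 1.5}

# Two examinees receive different subsets of the bank, yet both estimates
# are expressed on the same scale because the items share one calibration.
subset_1 = [bank[k] for k in ("a", "c", "e")]
subset_2 = [bank[k] for k in ("b", "d", "f")]
theta_1 = estimate_theta([1, 1, 0], subset_1)
theta_2 = estimate_theta([1, 0, 0], subset_2)
print(round(theta_1, 2), round(theta_2, 2))
```

With only three items per examinee the individual estimates are very
noisy, which is the practical reason group-level generalizations of the
model are needed under matrix sampling.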

Borrowing from item response theory, Bock (1976, 1981, 1982) and his
colleagues (Mislevy, Reiser, & Zimowski, 1981; Reiser, 1980) have made
considerable progress in adapting latent trait methods to the unique
problems posed by assessments. As is implemented in the current design
for the California assessment, scale scores in each of more than sixty
specific skill areas are computed using latent trait models modified to
handle multiple matrix sampling. In this way, scale scores in the domains
of interest can be reported without depending on specific items. The
methodology depends only on the development and maintenance of a bank of
calibrated exercises in each content domain. New exercises can be added
to the assessment as old ones are released to the public without
compromising in any way the ability of the National Assessment to measure
change. Furthermore, results of the assessments can be reported directly
in terms of the skill areas of interest rather than in terms of specific
exercises, whose coverage of the skill area is incomplete and whose
psychometric characteristics are unknown.

The conventional machinery for latent trait estimation must be
generalized to handle the complexities created by multiple matrix sampling.
Since exercise-examinee sampling procedures may dictate that any one
student takes only a very small number of exercises per skill area, latent
trait estimation at the level of the individual is very imprecise. What
is needed is a methodology for defining the latent trait at the group
level instead. Such a generalization of Bock's (1976) model was performed
by Reiser (1980). In Reiser's model, the probability of a correct
response to a particular exercise by a student selected at random from the
group is a function of the exercise parameters and the average level of
attainment in that group. The latter is a function of the main effects
and interactions that define that group. His estimation procedure
produces the information about the population and subpopulations that the
assessment is designed to provide. An advantage of Reiser's procedure is
that it is designed to produce scale scores for a population from as few
as one exercise per skill area per booklet.
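
One way to write the group-level relation just described is sketched
below. The notation is an assumption made for illustration and is not
taken from Reiser (1980), whose exact parameterization may differ: $F$
stands for some increasing link function (for example, a logistic or
normal ogive), $a_j$ and $b_j$ are parameters of exercise $j$,
$\bar{\theta}_g$ is the average attainment of group $g$, and $r(g)$ and
$s(g)$ are the levels of two hypothetical classification factors (such
as region and type of school) that define the group.

```latex
P(x_j = 1 \mid g) = F\bigl(a_j(\bar{\theta}_g - b_j)\bigr), \qquad
\bar{\theta}_g = \mu + \alpha_{r(g)} + \beta_{s(g)} + (\alpha\beta)_{r(g)\,s(g)}
```

Under this reading, the quantities of interest to the assessment are the
group-level terms on the right, which can be estimated even when each
student answers very few exercises in a skill area.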


In the California assessment, each of 62 skill areas is assessed
using an average of sixteen items. However, each examinee responds
to no more than two items per skill area. Using a latent trait model
generalized for group data, scale scores at the unit of the school are
reported. As is the case for latent trait methods designed to score
individuals, the method produces scale scores in well-defined units
suitable for the measurement of change. Mislevy, Reiser, and Zimowski (1981)
used this procedure to study change in mathematics attainment from 1972 to
1977.

Using such procedures, scale scores can conveniently be computed from
exercises in several matrix samples. Because the exercises have been
calibrated using the latent trait methodology, these scales are invariant
with respect to addition or deletion of particular exercises defining that
skill area.



Latent Class Analysis

Although this generalization of item response theory is a very marked
improvement over the present practices of NAEP in that it uses information
from all booklets in an efficient manner and reports in terms of scale
scores, its applicability to highly curriculum-dependent types of
exercises is open to question. For such exercises, the dimensionality of the
space is obviously greatly affected by whether or not students have
received instruction in the area the exercise assesses. Thus, both the
patterns of interitem association and difficulty level are strongly
affected by the school curricula, rather than merely by the ability
(aptitude) of the students. Indiscriminate use of methods designed primarily
to assess stable characteristics of the person can be misleading if used
in this context.

As an alternative to use of latent trait methods in an assessment
context, Haertel (1980, 1981) has proposed that restricted latent class
methods (Lazarsfeld & Henry, 1968) be used to model the item response
data. The advantage of these methods is that their assumptions are likely
to be more congruent with the nature of highly curriculum-dependent item
responses. A distinctive feature of these models is that skills are
treated as dichotomous: A given examinee either does or does not possess
each skill. If an item requires only the skills an examinee possesses,
then he or she can solve the problem; otherwise, he or she cannot. If
such models are applied to data arising from studies designed to assess
individual performance, this might not be an appropriate assumption. But
for describing populations, the models work well. The assumption of skill
dichotomies corresponds naturally to the fact that responses are strongly
influenced by whether or not students have received instruction in the
skill area. The methodology of the latent class analysis itself is
required to cope with the probabilistic nature of item responses.

The methodology developed by Haertel for analyzing assessment data
involves the following steps:

First, exercises are characterized according to the skills required 
to successfully solve them. Unlike latent trait analysis, which ordinarily 
requires that the responses are a function of only one latent trait, 
latent class analysis permits the researcher to study exercises that 
require several skills for correct response. 

Second, the union of all skills needed to solve all exercises is
assembled. Each subject, then, is assigned to some skill profile based on
that subject's pattern of right and wrong responses to the set of
exercises.



Third, the statistical analysis assigns some probability to each
possible skill profile. The probability is interpreted as the estimate of
the population proportion that possesses that pattern of skills. The
analysis is probabilistic: it recognizes that the skills an exercise
requires and an examinee possesses are the sole determinants of the
probability that the examinee will answer correctly. In fact, there is some
(hopefully low) probability that an examinee lacking one or more of the
requisite skills will answer the item correctly and a (hopefully high)
probability of a correct response by an examinee who possesses all the
requisite skills. These are known as the false positive and true positive
rates, respectively. The former probability is, in general, greater than
zero due to guessing, and the latter is, in general, less than one due to
carelessness. The analysis consists of estimating simultaneously the
proportions of examinees in each latent class (i.e., skill profile) and the
false positive and true positive rates for each exercise part.
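
The three steps above imply a simple probability model for each response
pattern, which can be sketched as follows. This is an illustration under
stated assumptions rather than Haertel's procedure: the two skills, the
two exercises, the class proportions, and the error rates are invented,
and the sketch only evaluates the model for given parameter values; it
does not perform the simultaneous estimation described in the text.

```python
from itertools import product

def p_correct(profile, required, false_pos, true_pos):
    """P(correct) on one exercise for an examinee with this skill profile."""
    has_all = all(profile[s] for s in required)
    return true_pos if has_all else false_pos

def pattern_probability(pattern, exercises, class_props):
    """Marginal probability of a right/wrong pattern, summed over latent classes."""
    total = 0.0
    for profile, proportion in class_props.items():
        likelihood = 1.0
        for x, (required, fp, tp) in zip(pattern, exercises):
            p = p_correct(profile, required, fp, tp)
            likelihood *= p if x else 1.0 - p
        total += proportion * likelihood
    return total

# Hypothetical exercises: (indices of required skills, false positive
# rate, true positive rate). The second exercise needs both skills.
exercises = [((0,), 0.10, 0.90), ((0, 1), 0.20, 0.80)]

# Hypothetical population proportions for each skill profile
# (skill 0, skill 1); profiles omitted here are assumed empty.
class_props = {(0, 0): 0.4, (1, 0): 0.3, (1, 1): 0.3}

pattern_probs = {
    pattern: pattern_probability(pattern, exercises, class_props)
    for pattern in product((0, 1), repeat=len(exercises))
}
for pattern, prob in pattern_probs.items():
    print(pattern, round(prob, 4))
```

Fitting the model amounts to choosing the class proportions and error
rates that make these pattern probabilities agree as closely as possible
with the observed pattern frequencies.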

The great advantage of this method is that these proportions are
descriptive of component skills rather than specific exercises.
Furthermore, they appear to possess the desired properties of invariance
across examinee-exercise sampling that is crucial if such statistics are to
be meaningful. In his study of math attainment, Haertel (1981) found that
his estimates were not significantly different across NAEP booklets, and
thus, could be combined to produce estimates for the population as a
whole. Due to the structure of NAEP matrix sampling, this invariance held
across examinee-exercise pairings; this is obviously the most stringent
practical test of invariance.

Like the latent trait methodology, the latent class approach frees
the National Assessment from reporting merely single exercises, but in
addition it permits a more fine-grained and theoretically defensible
analysis of attainment in many different skill areas. Most notably,
latent class analysis is very well suited for the analysis of patterns of
skill acquisition, since it explicitly takes into account the fact that
several skills may be required to respond correctly to a given item.
Ordered or hierarchical patterns of skill acquisition may conveniently be
studied.



The results are especially amenable to description in simple
declarative sentences because the report is worded in terms of the
proportion of the population possessing a given skill or pattern of skills
(e.g., "Thirty-five percent of seventeen-year-olds could solve linear
equations in one unknown"). That is, the skill dichotomy assumption
corresponds well with the layman's notion of skill mastery. The lay public
can easily understand the meaning of statements like "... of
seventeen-year-olds can balance their checkbooks" or "...can understand
labels on products in the grocery store." Such statements actually invoke
the concept of the latent class and the idea of generalization across both
stimuli and time.

It is curious that the California assessment attempts to meet this 
need within the context of latent trait methods. To define "mastery,"
Bock (1981) arbitrarily chose an 80% probability that a randomly selected
student would get an item right. He can then report on the proportion of
students who have reached "mastery." Of course, another arbitrary choice
of mastery level would produce different estimates. Clearly, it is
preferable to use an analysis that defines mastery level empirically. The
methodology developed by Haertel accomplishes this. 
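
The sensitivity to the choice of cutoff can be shown with a small sketch. It assumes a simple logistic item response function and ten made-up ability values; neither is taken from the California assessment, and the function names are hypothetical.

```python
import math

# Illustrative sketch only: the logistic curve, its parameters, and the
# ability values are assumptions, not the California assessment model.

def p_correct(theta, difficulty=0.0, discrimination=1.0):
    """Logistic probability that a student of ability theta answers correctly."""
    return 1.0 / (1.0 + math.exp(-discrimination * (theta - difficulty)))


def proportion_at_mastery(thetas, cutoff, difficulty=0.0, discrimination=1.0):
    """Share of students whose probability of success meets the cutoff."""
    masters = sum(
        1 for t in thetas
        if p_correct(t, difficulty, discrimination) >= cutoff
    )
    return masters / len(thetas)


# Hypothetical ability values for ten students.
ABILITIES = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0]

if __name__ == "__main__":
    for cutoff in (0.7, 0.8, 0.9):
        print(cutoff, proportion_at_mastery(ABILITIES, cutoff))
```

With these made-up values the reported proportion of "masters" is 0.5, 0.4, or 0.2 depending on whether the cutoff is set at 70%, 80%, or 90%, which is the arbitrariness at issue.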



Conclusion 



Although the final results of a latent class analysis are worded in 
simple language, the technical problems involved in generalizing from 
specific exercises to component skills are far from trivial. Regrettably, 
the current simple-minded policy of item-by-item reporting to address
specific skill areas is not acceptable; to meet the original goal of 
reporting on progress in many diverse skill areas, it is essential that 
the modern psychometric techniques discussed here be brought to bear on 
the problem. 



In our view, latent trait analysis has a more limited place in educational
assessments than does latent class analysis. Latent trait analysis
lends itself much more readily to the measurement of higher level and more
generalized cognitive skills than to the present objectives of NAEP.
However, it is clear that such variables do have a place in a National
Assessment. The prominence of Scholastic Aptitude Test scores suggests
that such generalized variables, in fact, have considerable impact with
the general public. Similarly, the research community has shown that
aptitude variables are highly relevant to policy questions (e.g., Cronbach
& Snow, 1977). There is no doubt that the generalizations of latent trait
techniques for group data due to Bock and his colleagues are the methods
of choice for aptitude indicators. For analysis of national assessment
achievement data, however, the latent class methods provide a far more
technically defensible and readily interpretable means of reporting in
specific skill areas.





Computer-Administered Testing



Feasibility

The basic questions of whether computers can be used to administer
tests and whether adaptive testing is technically feasible have already
been answered positively. For a number of years, Frederic Lord and David
Weiss, among others, have been doing research on these topics, and efficient
procedures have been developed for using computers, whether they be
mainframe, mini, or microcomputers, to administer tests. The flexibility
of test presentation, control of administration, and sensitivity to
responses promised by a future of computer-administered tests call for the
kind of leadership in technical innovation for which NAEP was designed.
It is altogether appropriate, we believe, for NAEP to aim towards a goal
of computer-based assessment.



Practicality 

A question that has not been answered, however, is whether computer
testing is yet a practical approach to testing, especially for an effort
the size and scope of NAEP. Virtually all studies of computer-administered
testing, especially adaptive testing, have focused on tests composed of
multiple-choice items where each item is independent of all other items,
and where the question, all options, and any associated stimulus material
can all be displayed together on a single CRT screen display. The examinee
makes his or her choice based on the material shown on the screen and then
the computer selects the next item to be presented, either adaptively or
in sequence, and presents that material. Multiple questions based on the
same stimulus materials have been little used in studies of
computer-administered testing, nor have items that are based on figural or
long textual materials. Item formats other than multiple choice have
received little or no attention. While these limitations would place only
small restrictions on NAEP exercises in certain content areas, it is clear
that many NAEP exercises, perhaps even whole content areas, could not be
administered under these restrictions.
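
The difference between presenting items in sequence and selecting them adaptively can be sketched as follows. This is a deliberately simplified illustration, not the procedure of Lord, Weiss, or any operational system: the item bank, the difficulty values, and the step-halving ability update are all assumptions.

```python
# Illustrative sketch only: item identifiers, difficulties, and the
# step-halving ability update are hypothetical simplifications.

def next_item_sequential(items, administered):
    """Return the first item (in bank order) not yet administered."""
    for item_id in items:
        if item_id not in administered:
            return item_id
    return None


def next_item_adaptive(items, administered, ability):
    """Return the unused item whose difficulty is closest to the estimate."""
    remaining = [i for i in items if i not in administered]
    if not remaining:
        return None
    return min(remaining, key=lambda i: abs(items[i] - ability))


def run_adaptive(items, answer_fn, n_items, start=0.0, step=1.0):
    """Administer up to n_items adaptively; return (item order, final estimate)."""
    ability, administered, order = start, set(), []
    for _ in range(n_items):
        item_id = next_item_adaptive(items, administered, ability)
        if item_id is None:
            break
        administered.add(item_id)
        order.append(item_id)
        # Move the estimate up after a correct answer, down after an error.
        ability += step if answer_fn(item_id) else -step
        step /= 2.0
    return order, ability


# Hypothetical item bank: item id -> difficulty.
ITEM_BANK = {"a": -1.0, "b": 0.0, "c": 1.0, "d": 2.0}

if __name__ == "__main__":
    print(run_adaptive(ITEM_BANK, lambda item_id: True, 3))
```

Both selection rules assume independent, self-contained items, which is exactly the restriction the paragraph above identifies.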





In addition to limitations in the types of test items that have been
used in computer-administered testing, computer testing has still not been
implemented widely, particularly at the elementary and secondary school
levels. Even the military services, usually leaders in the adoption of
new technology, have not yet made any large-scale commitments to computer
testing. There is currently a program underway to computerize the
administration of the Armed Services Vocational Aptitude Battery (ASVAB), which
is administered to all potential enlistees. This effort is still at least
several years from implementation, however, and is intended to be limited
to multiple-choice items. While the PLATO system might appear to be a
major user of computer testing, the testing is in fact an integral part of
the individualized teaching/learning system and not a stand-alone testing
program. In terms of stand-alone computer testing, the largest use to
date may in fact be with specialized microcomputer systems designed for
use in the offices of counselors and clinical psychologists. These
computers administer, score, and provide rapid feedback of results on tests
and inventories such as the MMPI.

There appear to be two major reasons for this lack of widespread
adoption of computer testing.



• To date, the hardware costs associated with implementing
such an approach to testing have been quite high, although
costs continue to decrease rapidly as the technology evolves.

• Computer output format restrictions in terms of what can be
displayed on a CRT screen have, for the most part, limited
the content to stand-alone, self-contained multiple-choice
items.



If these two barriers can be removed, and we feel that there is a high
potential for this happening, then there is a bright future for computer
testing in many areas of education, including NAEP.







Recommended Feasibility Study

We recommend that a two-pronged effort be undertaken to determine the
potential for application of computer testing to NAEP and to monitor
changes in that potential over time. One aspect of this effort would
involve negotiating with computer hardware developers to identify advances
with the greatest potential either to decrease the cost or to increase the
performance of computer testing systems. To accomplish this end, the NAEP
grantee should establish a semi-formal communications network with hardware
experts from various segments of the computer industry. Working with
the advice and counsel of these individuals, particular attention should
be paid to such issues as:



• Should computer testing systems be developed around main- 
frame computers, minicomputers, or microcomputers? 

• Should a testing station consist of a terminal linked to a 
master computer, or should each station be a stand-alone 
computer? 

• If stand-alone computers are used, should each station have 
independent memory storage, or should they be linked to a 
common storage device such as a Winchester Disk? 

• What types of display options could be used? What about the
use of videodiscs for image storage and display?

• How does the current cost of an optimal system compare with
the printing and scoring costs that would be saved?

The other part of this approach should involve a small-scale (relative
to the total NAEP budget) research effort to seek answers to some of the
practical questions related to the use of computer testing in NAEP. The
following issues should be investigated.



• What is the most effective way to administer various kinds
of exercises that are not multiple choice in format? This
would include exercises that require that the examinee fill
in missing information, write sentences or paragraphs,
produce or perform some work, and so on. Would tests of
response speed add important mediator information not available
from paper-and-pencil tests?



• What is the best way to present long or complex stimulus
materials such as reading passages, charts, figures, and the
like? Finding a satisfactory answer to this question is
critical to the presentation of many NAEP exercises via
computer. For example, as a part of an art exercise, the
examinee might be required to look at a detailed or colored
drawing or picture, which would require the use of a color
display device. Similarly, as a part of a music exercise,
the examinee might be required to look at part of a musical
score or listen to a passage, which would also require a
special display device.

The same sort of display factors apply to reading exercises
where the examinee is to read a passage and then answer
questions about it. In a typical paper-and-pencil reading
test, the entire passage and all of the questions are available
to the examinee at all times, and the examinee is free
to look back at the passage as often as is desired. In
fact, the most effective strategy for answering questions
about a reading passage is to read the questions before ever
looking at the passage and then read the passage to find the
answers to the questions. With a computer-administered
test, however, it is likely that more than one complete CRT
display would be required to present the reading passage and
an additional CRT screen display would be required for each
question. While the examinee could jump forward or backward
among the text passage and question screens by pressing
keys, this is not the same as scanning back and forth by
eye. NAEP needs to determine whether computer-presented
exercises are equivalent to the same exercises presented in
exercise booklets in terms of examinee scores, reliability,
and interactions with examinee characteristics. If there is
a non-equivalence, it is possible that the computer-based
items will be found to be more powerful. However, when
comparisons are to be made with previously administered
paper-and-pencil tests, appropriate adjustments may be
necessary.

• What about the use of computer-controlled videotapes,
videodiscs, or slide projectors to display needed information on
a small screen or a second CRT display in place of a
supplementary printed information booklet? While technically
feasible, at the present time these approaches are rather
expensive, so much so that their use would probably not be
cost effective now. However, NAEP staff should continue to
monitor developments in display technology so as to be aware
when there are significant cost decreases in current
technology or when new technology develops. For example,
although laser videodisc mastering costs $2000 at present,
it will cost $20 with technology currently on the drawing
boards.




• Are scores obtained by computer testing and paper-and-pencil
testing approaches equivalent? For example, we know that
some examinees experience test anxiety and that there are
some individuals who experience computer anxiety. Will the
combination of computers and tests result in greater levels
of anxiety and thus lower test scores? Will the introduction
of computer testing cause Hawthorne effects to occur
(i.e., where the novelty and perceived special attention
cause the examinees to try harder and perform better)?

• How will computer testing influence the test-taking strategies
of examinees? For example, when answering a paper-and-pencil
test, many examinees skip the items they find to be
more difficult and come back to them later. What is the
appropriate level of control of this behavior to impose with
computer administration?

• What are the logistics of temporarily placing computers at
schools for testing? Can this be done by local school
personnel, or will NAEP-trained personnel have to continue
going from school to school to conduct the testing? (If the
computer system required little supervision, the hardware
and developmental costs would be partially balanced by
decreased costs for administration personnel, especially for
individually administered exercises.)

• What are the logistics of combining data obtained from the
many testing sites so as to produce the final data files 
upon which results and reports will be based? How much 
aggregation will take place in the field, possibly during 
the testing, and how much will be performed at a central 
location? 

• Should NAEP consider using branching testing in which answers
to certain background questions will determine which
of several tests an examinee should take? In foreign languages,
for example, it would make sense to give a French
test only to individuals who had studied French and a
Spanish test only to individuals who had studied Spanish.
The same approach might apply with a mathematics test,
especially at the senior high school level. The examinee
who had studied advanced algebra, solid geometry, or even
the calculus might receive a different test from someone who
had only taken general or business mathematics. Use of
computer testing makes branching like this highly feasible.
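
The branching described in the last issue above can be sketched in a few lines. The background field names, course labels, and form names are hypothetical; this is one possible routing rule under those assumptions, not a NAEP specification.

```python
# Illustrative sketch only: field names, course labels, and test form
# names are hypothetical.

ADVANCED_MATH_COURSES = {"advanced algebra", "solid geometry", "calculus"}
LANGUAGE_TESTS = ("French", "Spanish")


def select_language_tests(background):
    """Give a language test only to examinees who studied that language."""
    studied = set(background.get("languages_studied", []))
    return [lang for lang in LANGUAGE_TESTS if lang in studied]


def select_math_form(background):
    """Route examinees with advanced coursework to a different math form."""
    courses = set(background.get("math_courses", []))
    if courses & ADVANCED_MATH_COURSES:
        return "advanced mathematics form"
    return "general mathematics form"


if __name__ == "__main__":
    examinee = {
        "languages_studied": ["Spanish"],
        "math_courses": ["general mathematics", "calculus"],
    }
    print(select_language_tests(examinee))
    print(select_math_form(examinee))
```

Because the routing depends only on answers collected before testing begins, it could be applied at the testing station without any central computation.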



To obtain answers to these questions, and others that are likely to
arise, the NAEP grantee should acquire a prototype multi-examinee computer
testing system, which can be used to carry out empirical studies. Prior
to acquiring such a computer testing system, however, NAEP staff should
consult with experts in the computer hardware field to seek their guidance
with regard to anticipated future developments. The system acquired
should be one that best anticipates and can take advantage of future
developments in terms of cost and capabilities. It is possible that a
customized, rather than an off-the-shelf, system would be the most
cost-effective approach to computer testing in NAEP. Such a customized system
would contain only those features required to carry out the testing and
would omit other costly features that are not required.

The first studies would explore general questions like those mentioned 
above. Then, in the 1985-86 assessment, a particular small subset of NAEP
might experimentally be performed using computers. The outcome of this 
demonstration would guide subsequent expansion or revision of plans for 
computer administration. 

Conclusion 

We have no doubt that eventually the National Assessment will be 
administered largely or entirely by computer, with great increase in 
flexibility, efficiency, and the amount of information collected per 
examinee-hour. The primary questions are merely when and how. We believe
that the gradual exploration and introduction of computers into test 
administration is the optimal form of leadership role for NAEP to play in 
this area. 




Establishment of an Educational Assessment Institute



Introduction

The NAEP grant from NIE will only provide limited resources for
scholarly inquiry into large-scale assessment methodology and utilization,
and it will provide no resources for independent monitoring and critiquing
of NAEP policies and procedures. As a result, AIR's proposal for a NAEP
planning grant included brief reference to an independently funded
Educational Assessment Institute that would support joint research and
development activities aimed at improving large-scale assessment theory and
practice. Our planning grant activities focused initially on determining
the functions such an institute might carry out and how it might be
organized. When collaborative activities with the Stanford University School
of Education produced an apparently useful blueprint, we followed up with
preliminary explorations to identify (1) requirements and sources for
independent funding and (2) initial planning steps that would be required
to tap these sources and establish the Institute. This chapter provides
an overview of the Institute's potential functions, organization, and
funding. It concludes with a report on the current status of Institute
planning efforts. The orientation of this chapter is necessarily centered
on the Stanford/AIR locus of the Institute, but the concepts might equally
well be applied to another grantee. We expect that the Stanford/AIR locus
will prove attractive to potential funding sources.

Potential Functions for an Educational Assessment Institute

AIR and the Stanford University School of Education jointly identified
five major functions that an independent Educational Assessment Institute
might carry out: (1) support resident scholars, (2) independently
review and critique NAEP policies and procedures, (3) conduct research and
training seminars, (4) sponsor an annual conference on large-scale
assessment, and (5) interact with the international assessment community.



Resident center for scholarly inquiry. Based on the model provided
by the Center for Advanced Studies in the Behavioral Sciences, which is
located on the Stanford University campus, the Educational Assessment
Institute would be a resident center for scholarly inquiry. It would
provide six-month and twelve-month resident fellowships for researchers
who merit these appointments. Fellows would be awarded both living
expenses and stipends in lieu of their regular salaries. Nominations for
fellowship recipients would be solicited from the general academic
community; nominees who pass initial screening criteria would be asked to
submit applications describing their research interests in the Institute's
two major fields of inquiry: large-scale assessment methodology and
techniques to promote utilization of assessment information for improving
education. Applicant credentials and statements of research interest
would be reviewed semi-annually by an independent Institute Board composed
of eminent scholars drawn from the Fellows of the National Council for
Measurement in Education (NCME) and Divisions 5 (Measurement) and 15
(Educational Psychology) of the American Psychological Association and
from the senior AIR staff and faculty of the Stanford School of Education.
Selected applicants would normally take sabbatical leave from their
current positions to spend a specified period (six or twelve months) in
Palo Alto under the auspices of the Educational Assessment Institute.
While in residence, fellows would have access to Institute-funded computer
and clerical support and all NAEP public-use resources, including the NAEP
item and data banks, Clearinghouse, and computer software packages. They
would also interact with senior AIR and Stanford School of Education staff
through frequent in-house seminars.

Permanent center for scholarly review and critique of NAEP policies
and procedures. In their recent volume Toward Reform of Program
Evaluation, Cronbach and his associates (1980) presented two general theses
that are relevant to this proposed function:

Oversight by peers is the most promising means of upholding 
professional standards and of precipitating debate about 
strategic and tactical issues. 



And: 



The best safeguard against prematurely frozen standards for
evaluative practice is multiple, independent sources of
criticism. (p. 10)





Several recent examples illustrate the practical value of these
theses in relation to major educational program evaluations. In the early
1970s, the Huron Institute was requested to provide planning and monitoring
assistance to U.S. Office of Education staff who were initiating the
major Follow-Through planned variations study. Huron's role evolved over
the years into that of scholarly critic; moreover, this role was influential
in helping to shape the Follow-Through evaluation in numerous ways.
As Michael Garet's final report evaluating Huron's work makes clear,
"measured against technical, organizational, political, and social-scientific
criteria, Huron's impact on the evaluation in the final years was
without doubt a healthy one" (Garet, 1978, p. 68). Huron became a kind of
broker as well as being a source of bright ideas, technical advice, and
criticism; it smoothed communication between various parties involved in
the evaluation, exerting quiet influence both in Washington and in the
evaluation contractors' offices. Garet (who is now on the Stanford
University faculty) made important recommendations about how, in the future,
monitors/critics might be selected to work on major evaluations; the
language needs only minor editing to apply to an independent educational
assessment institute charged with infusing NAEP with fresh ideas,
perspectives, and constructive criticism:





There are several criteria that might be considered in selecting
an organization to serve as monitor....

1. The technical skill of the external monitor should, of
course, be the equal of that of the evaluation contractor.

2. The monitoring organization should possess the ability to
work closely with the major organizational units involved
in conducting the evaluation as well as the flexibility to
shift resources easily from one monitoring task to
another....

3. The organization should have a certain amount of legitimacy
among the evaluation constituencies; that is, it should
hold a secure, independent status based upon a serious and
continuing interest in the problems and programs being
evaluated.

4. The organization should have a relatively strong research
identity; that is, it should have a fairly coherent
social-scientific approach to the problems of evaluation, in order
to encourage a meaningful dialogue concerning evaluation
methods and results. (pp. 72-73)



More recently, Charles Murray (1980) made similar observations about
the role played by an Evaluation Research Society (ERS) panel in shaping
AIR's NIE-sponsored evaluation of the Cities in Schools service integration
experiment. The independent ERS panel, constituted prior to the
evaluation and maintaining ongoing contact with it, was thought to be much
more effective in giving useful (and heeded) advice than would have been
the case with one-shot, usually post hoc reviews by more traditional
scholarly critics.

We anticipate that this function could be one of the most important 
to be carried out by the NAEP Educational Assessment Institute and would 
be the one to which permanent Institute-affiliated staff from both Stanford
University and AIR would devote a high proportion of their efforts.

Center for research and training seminars. The Educational Assessment
Institute would periodically organize and sponsor research and training
seminars (usually in cooperation with other scholarly organizations
and institutions of higher education). The purpose of these seminars
would be to share information about the techniques and findings of ongoing 
research into large-scale assessment topics. Institute staff and fellows 
would organize the seminars. Participants would be assessment researchers 
around the country who could benefit both by learning new techniques and 
by receiving peer criticism of their own work. 

As a general rule, seminar participants would be required to pay
their own expenses. If desirable, continuing education credits could be
awarded by the Stanford School of Education. Seminars would be announced
through mailings to members of national measurement and evaluation
professional associations. (Especially well-received seminars might also be
replicated at professional association conventions.)

In cooperation with other NAEP efforts to utilize new communication
technologies, the Institute would also sponsor an ongoing large-scale
assessment seminar via teleconferencing. Participants would all pay a
small fee and would then be given access to a computer-based
telecommunications network through which documents, questions, probes, thoughts,
dialogues, and group conversations could be shared, recorded, processed,
and analyzed. All researchers with access to a microcomputer or an
appropriate computer terminal with a telephone modem could participate in this
network without leaving their offices. Recent experience with such ongoing
teleconferences has begun to demonstrate their potential as useful
and inexpensive tools for timely problem identification and definition,
solution building, and policy evaluation. For example, the SPECIALNET
teleconference network, now operating under the sponsorship of the National
Association of State Directors of Special Education (NASDSE), serves the
following functions:

• providing electronic mail service, including person-to-person
messages and group announcements;

• obtaining immediate feedback from all or predetermined
representative samples of state directors regarding questions
of interest to network members, including those
related to possible national policies;

• sustaining ongoing seminars of interest to specialized
subgroups of network members;

• facilitating short-term collection of evaluative data and
feedback of results for individual network members or the
NASDSE; and

• providing easy access to computer utility functions such as
report generating and word processing programs for all
network members.





AIR is operating a similar network, the VIM Network, to facilitate
coordination and evaluation of interactive videodisc use in basic skills
instruction, under the sponsorship of the Division of Educational Technology,
U.S. Department of Education.

Sponsor for the Annual Conference on Large-Scale Assessment. Acting
together with the NAEP project staff, Stanford University, and measurement
and evaluation professional associations, the Educational Assessment
Institute would assume sponsorship of the Annual Conference on Large-Scale
Assessment. As it does now, the conference would focus on major research
issues regarding assessment techniques and information utilization. When
they register for the conference, participants would be asked to nominate
research topics they hope to see on the conference agenda. A team of
national and state assessment experts would then be invited (and paid) to
attend the conference and lead discussions on those topics and others
determined by the Institute Board.

The 1982 Twelfth Annual Conference in Boulder, Colorado, attracted
over 260 attendees who were representatives of state and local education
agency evaluation offices, college and university faculty, and national
professional associations. Future annual conferences, which we recommend
be held in Washington, D.C., would be expected to attract an equally large
or larger number of participants, especially in view of the fact that
attendees would be able to help structure the conference agenda.

U.S. locus for liaison with the international assessment network. A
small but active network of international assessment scholars has sprung
up over the last 15 years. This network is headquartered at the
International Association for the Evaluation of Educational Achievement in
Stockholm. In the past, U.S. participation in the Association has largely been
on an ad hoc basis, with no permanent locus for organizing U.S.
participation.

The Educational Assessment Institute would seek to organize all U.S.
participation in the International Association by providing staff time,
access to the NAEP exercises and data base, and logistical support for
U.S. contributions to international assessment endeavors. If sufficient
funding can be located, Institute scholars might participate in or even
organize international data exchange, analysis, and reporting efforts.

Institute Organization

The proposed Educational Assessment Institute would eventually be
incorporated as an independent not-for-profit organization under the
leadership of a permanent staff and Board. As mentioned previously, this
Board would be composed of nationally recognized assessment scholars
representing major professional associations, Stanford University, and
AIR. During its initial start-up phase, the Institute would probably be
organized as a center under the aegis of the Stanford School of Education,
and its initial Director would be selected from among the faculty (active
or emeriti) of that school. After the Board had organized, it would
select a permanent Director, who, in turn, would select the Institute
staff. We anticipate that some staff affiliations might be part-time, 
allowing access to Stanford University faculty and AIR professional staff 
having other research and teaching commitments. 

The Institute would be physically situated either on the Stanford 
University campus or in quarters near the campus, providing easy access to
both the School of Education and AIR. 



Institute Funding 

We assume that initial support for Institute planning and fundraising
activities would be provided by NIE through a modest line item in the NAEP
contract budget. These initial NAEP planning funds would be used to seek
permanent funds from private foundations having interests in the
improvement of American education.

The initial planning period would extend approximately eight months, 
during which the following activities would be carried out, under the 
overall direction of the Dean of the Stanford University School of Edu- 
cation: 



• appoint an interim Director;

• solicit nominations for and select the Institute Board;

• prepare an Institute prospectus and circulate it widely to
private funding sources with which contacts have already
been established through previous support;

• prepare a detailed proposal for $500,000 to provide three
years of core support and submit it to those foundations
expressing interest in the preliminary prospectus; and

• obtain funding to establish the Institute, with initial
functions and staff size to be determined by the level of
funding achieved.



Present Status of Institute Planning 

The Stanford School of Education has already drafted and prepared for 
circulation a preliminary Institute prospectus. This prospectus will 
shortly be sent to senior staff of several foundations known to have 
priorities in related areas. In the event expressions of interest are
received as a result of these preliminary inquiries, NIE will be
immediately notified.






References 



Bock, R. D. Basic Issues In the measurement of change. In de Gruljter, 
D. M, & van der Kamp, L. J. T. (Eds> ), Advances In psychological 
and educational measurement s New York: Wiley, 1976, 75-110. 

Bock, R. D., & Mtelevy, R. J. An Item response curve model for matrix- 

8anpllng''^d«a: The California grade three assessment. New Directions 
for Testing and Measurement , 1981,^10, 65-90. 

Bock, R. D., Mlslevy, R. J. , & Woodson, C. The next stage In educational 
assessment. Educational Researcher , 1982, JL1(3), 4-11. 

Cronbach, L. J., & Snow, R. Aptitudes and instructional methods: A handbook 
for research on interactions. New York: Irvington, 1977. 

Cronbach, L. J., et al. Toward reform of program evaluation. San 
Francisco: Jossey-Bass, 1980. 

DuBois, P. H. (Ed.) Invitational conference on testing problems. 
Princeton, NJ: Educational Testing Service, 1969. 

Education Commission of the States. A guide to National Assessment objectives 
and items (SY-OI-36). Denver, CO: Author, 1981. 

Flanagan, J. C. The critical-requirements approach to educational objectives. 
School and Society, 1950, 71, 321-324. 

Flanagan, J. C. (Ed.). Perspectives on improving education: Project 
TALENT's young adults look back. New York: Praeger, 1978. 

Flanagan, J. C., & Russ-Eft, D. F. An empirical study to aid in formulating 
educational goals. Palo Alto, CA: American Institutes for 
Research, 1975. 

Garet, M. S. External monitoring and the management of educational 
evaluations: Lessons learned from the Huron Institute's involvement 
in Follow Through. Washington, D.C.: U.S. Office of Education, 
1978. (Contract Report OEC-0-74-0394) 

Green, D. R. (Ed.) The aptitude-achievement distinction. Monterey, CA: 
CTB/McGraw-Hill, 1974. 

Greenbaum, W., Garet, M., & Solomon, E. Measuring educational progress. 
New York: McGraw-Hill, 1977. 

Haertel, E. H. Determining what is measured by multiple choice tests of 
reading comprehension. Unpublished doctoral dissertation, 
University of Chicago, 1980. 







Haertel, E. H. Developing a discrete ability profile model for mathematics 
attainment. Final report submitted pursuant to Grant 
NIE-G-80-OOa3 submitted to National Institute of Education, 1981. 

Lazarsfeld, P. F., & Henry, N. W. Latent structure analysis. New York: 
Houghton Mifflin, 1968. 

Lord, F., & Novick, M. Statistical theories of mental test scores. 
Reading, MA: Addison-Wesley, 1968. 

Mandler, G., & Sarason, S. A study of anxiety and learning. Journal of 
Abnormal and Social Psychology, 1952, 47, 166-173. 

Mislevy, R. J., Reiser, M., & Zimowski, M. Scale score reporting of 
national assessment data. Final report to the Education Commission 
of the States under contract #02-81-20314. Chicago: International 
Educational Services, 1981. 

Murray, C. A. The ERS review panel for AIR's Cities in Schools evaluation. 
Paper presented at the annual meeting of the Evaluation Research 
Society, Rosslyn, Virginia, 1980. 

PLAN master objectives. Palo Alto, CA: Westinghouse Learning Corporation, 
1971. 

Reiser, M. A latent trait model for group effects. Unpublished doctoral 
dissertation, University of Chicago, 1980. 

Rotter, J. Generalized expectancies for internal versus external control 
of reinforcement. Psychological Monographs, 1966, 80 (1, Whole 
number 609). 

Snow, R. Aptitude and achievement. New Directions for Testing and 
Measurement, 1980, 2, 39-59. 

University of Texas. Adult functional competency: A summary. Austin, TX: 
University of Texas, 1975. 




APPENDIX A 



PLAN Master Objectives 

January, 1971 



Westinghouse Learning Corporation 
PLAN Division 



FOREWORD 



A huge advantage of an instructional objective derives from the simple fact that it 
is written down. Once it is written, it is visible. Once it is visible, it can be reviewed, 
evaluated, modified, improved. 

Objectives are frequently discussed, but seldom seen. In these volumes you can see 
some four thousand instructional objectives in the subject areas of Language Arts, 
Mathematics, Science, and Social Studies, extending over the range from grade one 
through grade twelve. This collection represents the cooperative efforts of over one 
hundred classroom teachers and an almost equal number of staff members at the 
American Institutes for Research and the Westinghouse Learning Corporation. 

Since these volumes present written objectives rather than offer a discussion about 
objectives, they become the criteria by which materials are selected, content outlined, 
instructional procedures and educational technology developed, and tests and examinations 
prepared. All aspects of an educational program are really the means to accomplishing 
the basic educational purpose. This collection serves to stimulate teachers and educators 
in selecting and developing behavioral objectives for their local use. These objectives may 
be criticized and evaluated, revised and modified, added to or deleted, 
all with the view of arriving at an appropriate set of educational outcomes to 
meet the educational needs of a local situation and of individual students. 

The rather obvious purpose of an instructional objective should be to make clear to 
teachers, students, and other interested persons what youngsters should be able to do 
as a result of the instructional program. A well-written instructional objective should 
specify under what conditions and to what extent a certain kind of student performance 
can be expected to take place. 

Unfortunately, school systems commonly lack a comprehensive and reasonably consistent 
set of educational objectives. Educational goals and objectives are quite frequently 
expressed only in broad, global terms, and the question of what and how to teach is 
left to a considerable extent to the teacher. As a result, quality in the schools is closely 
associated with the qualified and artful teachers. No doubt considerable excellent 
educational work is done by artistic teachers who, while they do not have a clear 
conception of goals, do have an intuitive sense of what is good teaching. Their 
materials are significant, and they develop topics effectively with students. The artistic 
teacher clarifies the educational objectives (even those not directly stated) through 
her actions as she teaches intuitively. 

If the foregoing were to serve as a basis for defining education, then the "intuitiveness 
of the artistic teacher" would have to be built into the educational program. This, of 
course, cannot be done. The alternative is to start with clearly defined, rather than 
implied, instructional objectives. 

Educational objectives (even clearly stated, specific objectives) are in the final analysis 
matters of choice and thus are value judgments. The question then arises: 
Who provides these value judgments? In the last analysis, the public 
schools are operated to meet the needs of society. Some of the 
objectives and who shall attend school are provided for in the state 
constitutions and by laws. Others are set forth by the efforts of the 
elected representatives of the people of a community. Others are 
provided by the professional educators hired to operate the schools. 
Others come from our knowledge of the children themselves and how 
they learn. These effectively furnish the sources of educational 
objectives for a local public school. They will change with the changing 
conditions of the times; sometimes fast, as with Sputnik, but usually 
slowly. 

In evaluating and summarizing instructional objectives, whatever their source, certain 
kinds of information and knowledge provide a more intelligent basis for making decisions 
about objectives. If these facts are known and understood, the probability is increased 
that judgments about objectives will be wise and that the school goals will have greater 
significance, objectivity, and validity. For this reason, a large part of the so-called 
scientific study of the curriculum has concerned itself with investigations that may 
provide a more adequate basis for selecting instructional objectives wisely. 

The question is then raised as to what sources can be used for getting information that 
will be helpful. A good deal of controversy goes on between essentialists and progressives, 
between subject specialists and child psychologists, between sociologists and the philo- 
sophers, between this school group and that school group, over the question of the basic 
source from which objectives can be derived. The progressives and child psychologists 
emphasize the importance of studying the child to find out what kinds of interests 
he has, what problems he encounters, what purposes he has in mind. They see this 
information as providing the basic source for selecting objectives. The essentialists 
and subject specialists, on the other hand, are impressed by the large body of knowledge 
collected over many thousands of years, the so-called cultural heritage, and emphasize 
this as the primary source for deriving objectives. They view objectives as essentially 
the basic learnings selected from the vast cultural heritage of the past. 

Many sociologists and others concerned with the pressing problems of contemporary 
society see in an analysis of today's world the basic information from which objectives 
can be derived. They view the school as the agency for helping young people to deal 
effectively with the critical problems of modern life. If they can determine what the 
existent problems are, then the objectives of the school are to provide those knowledges, 
skills, and attitudes that will help people to deal intelligently and effectively with contemporary 
problems. On the other hand, the educational philosophers recognize that there are basic 
values in life, largely transmitted from one generation to another by means of education. 
They see the school as aiming essentially at the transmission of the basic values derived 
by comprehensive philosophic study and hence they see in educational philosophy the 
basic source from which objectives can be derived. 

The point of view recommended is that no single source of information is adequate to 
provide a basis for wise and comprehensive decisions about the objectives of the school. 
Each of these sources has certain values to commend it. Each source should be given 
consideration in planning. In this way educational programs may be developed that 
are flexible and suitable for any specific public school situations irrespective of whether 
the situation is influenced primarily by only one or any combination of these varying 
points of view concerning educational objectives. 

While the objectives in these volumes contribute to solving the difficult problem of 
delineating a curriculum, they should not be considered as a final and perfect product. 
Any set of objectives may in fact be considered tentative, requiring continuous updating 
and reevaluation to the educational purposes and programs at hand. To have critical 
comments made about one's objectives should be taken as a compliment, since this can 
only happen when one has taken the trouble to think them out and write them down. 






In spite of the great effort and man-hours that have gone into this task of compiling 
the objectives in these volumes, a number of the objectives listed cannot yet be considered 
to be "true" objectives (if by objectives we mean instructional outcomes described in 
performance terms). In fact, the editors wish to make the following critical comments 
as to some of the reasons why some of the objectives herein contained are open to 
multiple interpretation. 

1. Some describe a classroom activity taking place during the process of learning, 
rather than the performance to be exhibited by the proficient student after 
learning. 

2. Some lack a description, or even a suggestion of, the stimulus conditions under 
which a student is to perform. Conversely (and perversely), stimulus conditions 
are occasionally included when seemingly unimportant. 

3. Some statements (I use that term rather than objectives) fail to suggest any 
sort of criteria. Though all objectives do not demand criteria, this lack, perhaps 
more than anything else, makes for vagueness. 

The objectives in these volumes are the objectives for Project PLAN with slight editorial 
and organizational modifications. Project PLAN is a system of individualized education 
operative at grades one through twelve in the subject areas of language arts, mathematics, 
science, and social studies. Project PLAN was conceived by Dr. John C. Flanagan and 
to an extent evolved from the findings of Project TALENT, a large-scale, long-range 
project involving the collection of comprehensive information about education in the 
United States. Project TALENT involved the testing of a sample of 440,000 students 
in 1,353 secondary schools in all parts of the country in March, 1960, with subsequent 
follow-up studies. Through Dr. Flanagan's efforts, Project PLAN was brought into 
being in February, 1967, as a joint effort of the American Institutes for Research, 
Westinghouse Learning Corporation, and thirteen school districts.¹ Dr. Flanagan 
has continued to direct the developmental and research work on Project PLAN since 
that date and is an editor of these volumes. Assisting in the developmental work of 
Project PLAN has been Dr. Robert F. Mager. Dr. Mager is well known for his book, 
Preparing Instructional Objectives,² and his philosophy was followed in the development 
of the objectives in these volumes, of which he is an editor. 

The cooperating school districts furnished classroom teachers each year from 1967 
through June 1970 who developed the objectives and prepared the Teaching-Learning Units 
to accomplish the objectives under the supervision of American Institutes for Research 
and Westinghouse Learning Corporation professional personnel. The director of these 
activities was Dr. William M. Shanner, the third editor of these volumes. The teachers, 
at the end of each year, returned to their respective school districts to initiate the 
instructional programs organized from the objectives. 



1. Archdiocese of San Francisco, Department of Education, San Francisco, California; Fremont 
Unified School District, Fremont, California; San Carlos Elementary School District, San Carlos, 
California; San Jose Unified School District, San Jose, California; Santa Clara Unified School District, 
Santa Clara, California; Sequoia Union High School District, Redwood City, California; Union 
Elementary School District, San Jose, California; Bethel Park School District, Bethel Park, Pennsylvania; 
Hicksville Public School District, Hicksville, New York; Penn-Trafford School District, Harrison City, 
Pennsylvania; Pittsburgh Public Schools, Pittsburgh, Pennsylvania; Quincy Public Schools, Quincy, 
Massachusetts; Wood County Schools, Parkersburg, West Virginia. 

2. Mager, R. F. Preparing Instructional Objectives. Palo Alto: Fearon Publishers, 1962. 






The objectives in these volumes, then, have originated from teachers and have been 
tried out in schools. I wish to acknowledge the efforts of those teachers who were 
assigned by their school districts to work a year at the American Institutes for Research 
in Palo Alto, without whose contributions the objectives in these volumes would not 
have been possible. 

Archdiocese of San Francisco, Department of Education: Sister Maura 
Cole, Marian Bonnet, Janice Edminster, Sister Charlene Foster, Sister Bernice 
Heinz, Sister Patricia Hoffman, Sister Mary Vincent Gularte, Sister Anita Kelly, 
Sister Jeanne Marie Sosic 

Bethel Park School District: Lora Moroni, Gordon Lepri, James Johnson, Judith 
Andrews, Flora Belle Faddis, David Loadman, Mary Lou Ertman, Roger Johnson, 
Robert N. Manson, Anna Marie Kerlin, Frances Chase, Robert M. Caldwell 

Fremont Unified School District: Lyndall Sargent, Gail Pagan, Rex W. Estes, 
Caroline Breedlove, Monique Lowy, Charles Swanson, Eileen Trefz, Robert 
Fairlee, Beverly Ulbricht, Forrest W. Dobbs, Roy C. Fields, Bertram K. Robarts 

Hicksville Public School District: Elayne Kabakoff, Richard C. Leuci, Terrence 
Boylan, Janet Findlay, Willard Prince, Edward Albert, Phyllis A. Kabakoff, 
Lawrence Dauch, Gerald Shanley, Marjorie Giannelli, Tom Bannan, Gerard F. 
Irwin 

Hughson Union High School District: Warren Green 

Penn-Trafford School District: Gary Fresch, Mary Ann Kovaly, Michael Demko, 
Jack Reilly, Victor Bohince, David Garvin, LaVelle Hershberg, R. Bruce Robinson 

Pittsburgh Public Schools: Ann Mulroy, Jean Brooke, Kenneth Fraser, Shirley 
Fullerton, Ruth Aaron, Donald Coudriet, Cecilia Sukits, Carmen Violi, Samuel D. 
Martin, Paul J. Schafer, Mary South, Patricia Sellars 

Quincy Public Schools: Jean Ann MacLean, Priscilla A. Dauphinee, Francis 
Keegan, Katharine Norris, Dennis Carini, Richard Russell, Stephen Fishman, 
Jack K. Merrill, Marcia A. Mitchell, Robert J. Mattsson, Margaret E. Flynn 

San Carlos Elementary School District: Helen Dodds, Natalie Klock, Edith Bryant, 
Maxine Ross, Elizabeth Movinski, Martha A. Elmore, Charles B. Whitlock, Betty 
Lee, Lee Jensen 

San Jose Unified School District: Allaire Bryant, Rose Berry, Hal Garrett, Kathy 
Roberts, William Harvel, Judy Opfer, Judi Wells, Don Crowell, Oran T. Adams, 
Marilyn D. Johnson, Alice S. Anderson, Sylvia Atallah 

Santa Clara Unified School District: Nancy Wylde, Ruth Hessenflow, Arthur A. 
Hiatt, Herman Neufeld 

Sequoia Union High School District: Gale Randall, Rex Fortune, Robert W. DuBois 

Union School District: Jo Ann Risko, Peggy Schwartz, Rose Yamasaki, Glenn 
Moseley, Sue Coffin, Tod Hodgdon, Barbara S. Donley, Frank Kelly 

Wood County Schools: Roberta Adkins, Mary Rector, Larry Myers, Virginia 
Haller, John Hoyes, Connie Chapman, Ada Ardelia Price, David V. Westfall, 
Nancy M. Rice, John W. Apgar 






In addition, the contributions of the following persons should be acknowledged: 
Mary June Erickson, language arts; Josephine Matthews, Dr. Marie Goldstein, and 
Dr. Gordon McLeod, mathematics; Marvin Patterson, science; Dr. Vincent N. Campbell, 
social studies; Sarah Russell, primary; Katheryn Woodley, Dr. Mary Willis, Debbra 
Michaels, performance standards; and Dr. Helen Dell, editorial. 

Final acknowledgment should go to those who use the objectives in these volumes. 
Objectives alone, an educational program, they do not make. They provide at best 
only a framework. The responsibility for the learning must rest on the student, guided 
by the teacher, and supervised by the school administration. 



William M. Shanner 



Palo Alto, California 
December 15, 1970 







INTRODUCTION 



The PLAN Master List of Objectives has been prepared as a reference book for PLAN 
teachers and administrators. The book is divided into three sections. Objectives 
in the first section are for primary level concepts and skills, in the second section for 
intermediate level and in the third for secondary level. The objectives in each section 
of the book are organized into one of four subject areas, mathematics, science, social 
studies and language arts. Many of the objectives are found in more than one subject 
area, and these are indicated by an asterisk. 

Each subject area has been subdivided into major concept sections. Within these 
sections are terminal and transitional objectives. Terminal objectives are defined as 
major growth points in the cognitive, skill and affective development of students. 
Educators can specify terminal objectives as the ones they wish their students to achieve 
at the end of a definite time block. Such a time block might be at the end of third 
grade, at the end of eighth grade and at the end of the high school experience. 

Transitional objectives are listed under each terminal objective. They are defined as 
short term behavioral objectives, that is, concepts and skills to learn as prerequisites 
to the achievement of a terminal objective. A student may spend six months or several 
years achieving a series of transitional objectives before he is ready to challenge the 
terminal objective. Transitional objectives are organized sequentially under a terminal 
objective if sequence is important, and are clustered when they have a common theme. 
It is not necessary to achieve every transitional objective before challenging the terminal 
objective. Since many transitional objectives are cross-referenced in several concept 
areas, students may achieve transitional objectives which will simultaneously support 
several terminal objectives. 
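
The relationship between terminal and transitional objectives described above can be pictured as a small cross-reference table. The following Python sketch is illustrative only: the class name, the function name, and every objective number in it are invented for the example and do not come from the PLAN materials.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class TerminalObjective:
    """A terminal objective and the transitional objectives listed under it."""
    number: str                      # hypothetical identifier
    statement: str
    transitional: List[str] = field(default_factory=list)


def terminals_supported_by(terminals: List[TerminalObjective]) -> Dict[str, List[str]]:
    """Map each transitional objective number to the terminal objectives it supports."""
    index: Dict[str, List[str]] = {}
    for terminal in terminals:
        for number in terminal.transitional:
            index.setdefault(number, []).append(terminal.number)
    return index


# Invented data: transitional objective "1002" is cross-referenced
# under two different terminal objectives.
terminals = [
    TerminalObjective("T1", "hypothetical reading objective", ["1001", "1002"]),
    TerminalObjective("T2", "hypothetical writing objective", ["1002", "1003"]),
]
index = terminals_supported_by(terminals)
```

In this invented example, a student who achieves transitional objective "1002" has made progress toward both "T1" and "T2", which is the cross-referencing idea the paragraph describes.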

Each objective has been written to prescribe one of six designated levels of performance. 
The performance levels, based on Bloom's Taxonomy of Educational Objectives,¹ are 
indicated by the roman numerals at the end of each objective, as in the following 
examples: 

2212 After reading a fictional selection at the appropriate reading level, 
predict future consequences. (III) 

5045 Analyze a given selection by inferring the author's intent and by drawing 
conclusions from the evidence presented. (IV) 

The verbs used in the objectives are standardized to each performance level as defined 
in the glossary of this book. Some of the objectives are worded differently than they 
are in the present TLU's to correspond with the revised performance level and 
standardized verb list. 
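
The written form of an objective (a four-digit number, a statement, and a roman numeral in parentheses) is regular enough to be read mechanically. The sketch below is one possible way to do so in Python, assuming every objective follows the layout of the two examples above; the names `Objective` and `parse_objective` and the regular expression are illustrative assumptions, not part of the PLAN materials.

```python
import re
from typing import NamedTuple

# Roman numerals for the six performance levels named in the text.
LEVELS = {"I": 1, "II": 2, "III": 3, "IV": 4, "V": 5, "VI": 6}


class Objective(NamedTuple):
    number: str   # kept as text so leading zeros (e.g. "0352") survive
    text: str
    level: int


# Assumed layout: four-digit number, statement, "(ROMAN)" at the very end.
_PATTERN = re.compile(r"^\s*(\d{4})\s+(.*?)\s*\((I|II|III|IV|V|VI)\)\s*$", re.DOTALL)


def parse_objective(line: str) -> Objective:
    """Split one written objective into number, statement, and level."""
    match = _PATTERN.match(line)
    if match is None:
        raise ValueError(f"not a recognizable objective: {line!r}")
    number, text, roman = match.groups()
    # Collapse line breaks and repeated spaces inside the statement.
    return Objective(number, " ".join(text.split()), LEVELS[roman])


example = parse_objective(
    "2212 After reading a fictional selection at the appropriate reading level,\n"
    "predict future consequences. (III)"
)
```

Keeping the number as text rather than an integer is a deliberate choice here, since some objective numbers begin with a zero.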



1. Benjamin S. Bloom, Taxonomy of Educational Objectives. David McKay Company, Inc., 
New York, 1956. 






PERFORMANCE LEVELS 



LEVEL I This level requires only memorization of factual information or 
major topic headings. Questions on this level would not require opinions or 
interpretation of facts. An example of level I would be: 



0352 Identify the following properties of animals: how they eat, how they grow, 
how they change, how they move by themselves, and how they have babies. (I) 



The following verbs are used in level I objectives: 



Language Arts, Social Studies, Science 

    answer questions      locate 
    copy                  list 
    define                match 
    finish                name 
    follow directions     pronounce 
    identify              reproduce 
    indicate              select 
    label                 spell 
                          tell (retell) 



Mathematics 

    copy                  list 
    finish                match 
    identify              reproduce 
                          tell 



LEVEL II Objectives designated as level II require the comprehension of information 
or the use of a skill in a different context from the original. Students may be required 
to classify, describe, or interpret information. If a student is required to recognize 
information, he knows that the suggested answers will not be taken verbatim from 
the original material but will be given in a different context. This is in contrast to 
the use of the verb "identify" in level I which indicates a memorization of facts or 
concepts. A student may also be asked to draw conclusions or summarize information 
in level II. Tasks in mathematics at this level include solving numerical problems 
and algebraic equations and making graphs. If a formula is suggested for solving 
a problem, it is considered to be level II, differentiating the task from one where 
a choice of formulas or operations must be made. Writing sentences with the 
appropriate sentence structure or grammar usage is also considered level II. An 
example of this level would be: 

4940 Explain what is meant by adapting to environment (both biologically and 
culturally), and cite two examples of races adapting to their environments. (II) 

The following verbs are used in level II objectives: 



Language Arts, Social Studies, Science 

    classify              order 
    complete              read 
    construct             recognize 
    describe              suggest 
    draw conclusions      summarize 
    explain               use 
    express               relate 
    interpret             rewrite 






Additional verbs used only in Mathematics and Science 
at this level: 

    add                   integrate 
    calculate             measure 
    conclude              multiply 
    construct             organize 
    count                 plot 
    define                put in order 
    derive                record 
    divide                represent 
    estimate              simplify 
    expand                solve problems, equations 
    factor                square 
    find                  subtract 
    graph                 test 
    illustrate            translate 
    infer                 verify 
                          write numerals 



Objectives designated in levels III to VI require a higher cognition level than the two 
previous levels. Intermediate and secondary terminal objectives are found principally 
in these higher categories. These levels may require a certain amount of memorizing 
or simple comprehension but the end result should be a more complex cognitive performance. 

LEVEL III Objectives indicated as level III performance require the student to 
make an application of a principle, concept or skill. The student will be required 
to choose from several possible principles, formulas or concepts to demonstrate 
this performance level. The writing to be completed for objectives designated as 
level III does not require a creative effort. Writing is used instead to demonstrate 
an understanding of ideas, to support or refute a solution to a problem or to prepare 
an oral presentation. Objectives which require the ability to predict consequences 
and employ experimental procedures in finding solutions to problems designate 
level III performance level also. An example of this level would be: 

8415 Present evidence from world history to support or refute this statement: 
Too much involvement in foreign affairs over too long a time weakens 
a nation internally. (III) 

The following verbs are used in level III objectives: 



Language Arts, Social Studies, Science 

    act out               participate 
    apply                 predict 
    communicate           prepare and present 
    debate                present 
    demonstrate           pretend (role playing, perform) 
    discuss               support or refute a solution 
    find (information)    take notes 
    keep records          write 
    make, draw 






Additional verbs used only in Mathematics and Science 
at this level: 

    approximate           prove 
    determine             select 
    differentiate         solve word problems, 
                            problem situations 
    evaluate              tabulate 
    perform               write equations, problems, 
                            number sentences 

LEVEL IV Objectives indicating level IV performance require an organization or 
analysis of ideas in a far more complex manner than the lower performance levels. 
Analyzing an author's writing involves making inferences about the author's interest 
or convictions, his freedom from bias or the validity of his arguments. Some objectives 
of this level require students to distinguish facts from hypotheses and factual 
statements from normative statements. The analysis of elements to show relation- 
ships and the forming of generalizations are cognitive skills which are also a part 
of this performance level. An example of level IV would be: 

5256 Using the mass media as resources, analyze two or more viewpoints on a 
controversial issue. (IV) 

The following verbs are used in level IV objectives: 

Language Arts, Social Studies, Science, Mathematics 

analyze form generalizations 

determine infer 
differentiate organize 

LEVEL V Students are asked to use their creative skills when achieving objectives 
designated as level V. They may combine and organize ideas in a unique way, design 
a plan for solving a problem, develop a new formula, or write an original composition. 
This requirement is distinguished from the writing in level III objectives by the 
creativity it involves in contrast to the reporting of facts and observations. Many 
of the secondary terminal objectives require this level of performance. An example 
of level V would be: 

5269 Design, set up, and perform an experiment that will demonstrate that 
there is a 2:1 hydrogen to oxygen ratio in water. (V) 

The following verbs are used in level V objectives: 

Language Arts, Social Studies, Science, Mathematics 

    combine and organize  produce 
    design                write (original composition) 
    develop 

LEVEL VI Performance level VI requires the most complex skills and concep- 
tualization of ideas in the evaluation of plans, procedures, techniques or solutions 
to problems. Evaluations at this level are made on the basis of specific criteria. 
They cannot be made without a thorough consideration of all the facts and of the 
effect that ideas may have on efficiency, economy, utility and human problems. 
Terminal objectives requiring this performance level are supported by transitional 
objectives requiring the analysis of ideas, solutions to problems, and procedures. 
Although evaluation objectives are of the highest cognitive level, they may also 
be transitional objectives which enable the students to select the most appropriate 
ideas or techniques to produce a creative work in a terminal objective. An example 
of level VI would be: 

2374 After reading a book at an appropriate reading level, evaluate the validity 
of the message in terms of personal experience. (VI) 

The following verbs are used in level VI objectives: 

Language Arts, Social Studies, Science, Mathematics 

    compare and contrast 
    evaluate 
    make judgments 
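
Because the verbs are standardized to each performance level, the verb lists can be treated as a lookup table. The following Python sketch is a minimal illustration under stated assumptions: it copies only the verbs from the Language Arts, Social Studies, and Science lists for levels I to III and from the shared lists for levels IV to VI, it omits the additional Mathematics and Science verbs, and the function names are invented for the example. It also assumes the verb has already been isolated from the objective statement.

```python
# Verbs transcribed from the lists above (Mathematics-only additions omitted).
VERBS_BY_LEVEL = {
    1: {"answer questions", "copy", "define", "finish", "follow directions",
        "identify", "indicate", "label", "locate", "list", "match", "name",
        "pronounce", "reproduce", "select", "spell", "tell"},
    2: {"classify", "complete", "construct", "describe", "draw conclusions",
        "explain", "express", "interpret", "order", "read", "recognize",
        "suggest", "summarize", "use", "relate", "rewrite"},
    3: {"act out", "apply", "communicate", "debate", "demonstrate", "discuss",
        "find", "keep records", "make", "draw", "participate", "predict",
        "prepare and present", "present", "pretend", "support or refute",
        "take notes", "write"},
    4: {"analyze", "determine", "differentiate", "form generalizations",
        "infer", "organize"},
    5: {"combine and organize", "design", "develop", "produce", "write"},
    6: {"compare and contrast", "evaluate", "make judgments"},
}


def levels_for_verb(verb: str) -> set:
    """Return every performance level whose verb list contains the verb."""
    key = verb.strip().lower()
    return {level for level, verbs in VERBS_BY_LEVEL.items() if key in verbs}


def verb_matches_level(verb: str, level: int) -> bool:
    """Check whether a verb is one of the standardized verbs for a level."""
    return level in levels_for_verb(verb)
```

A verb can belong to more than one level: "write" appears at level III (reporting) and at level V (original composition), so a lookup returns a set of levels and the roman numeral at the end of the objective remains the deciding indicator.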

Educators can use the PLAN Master List of Objectives as a reference book in the evaluation 
of goals for their students. The index provides a reference from module to concept 
organization. The modules are listed in the index sequentially by number. The objective 
numbers refer to page numbers where each can be found. The reader will then be able 
to find the terminal objectives which each transitional objective supports. 



Helen D. Dell, Editor 






